PREFACE TO THE FIRST EDITION


The theory of factorial analysis is mathematical in nature, but
this book has been written so that it can, it is hoped, be read by
those who have no mathematics beyond the usual secondary 
school knowledge. Readers are, however, urged to repeat 
some at least of the arithmetical calculations for themselves. 

It is probable that the subject-matter of this book may 
seem to teachers and administrators to be far removed from 
contact with the actual work of schools. I would like 
therefore to explain that the incentive to the study of 
factorial analysis comes in my case very largely from the 
practical desire to improve the selection of children for 
higher education. When I was thirteen years of age and 
finishing an elementary school education, I won a “scholarship”
to a secondary school in the neighbouring town, one
of the early precursors of the present-day “free places”
in England. I have ever since then been greatly impressed
by the influence that event has had on my life, and have 
spent a great deal of time in endeavouring to improve the 
methods of selecting pupils at that stage and in lessening 
the part played by chance. It was inevitable that I should 
be led to inquire into the use of intelligence tests for this 
purpose, and inevitable in due course that the possibilities 
of factorial analysis should also come under consideration. 
It seemed to me that before any practical use could be 
made of factorial analysis a very thoroughgoing examina- 
tion of its mathematical foundations was necessary. The 
present book is my attempt at this. ... It may seem remote 
from school problems. But much mathematical study and 
many calculations have to precede every improvement 
in engineering, and it will not be otherwise in the future 
with the social as well as with the physical sciences. 

GODFREY H. THOMSON
MORAY HOUSE,
UNIVERSITY OF EDINBURGH, 
November 1938 


PREFACE TO THE FIFTH EDITION 


In earlier editions since the first, the chief changes in the
second edition were that the original chapter on Simple 
Structure was expanded into three, to cover oblique 
factors and second-order factors, while Dr. D. N. Lawley 
provided a chapter on factor analysis by maximum like- 
lihood, and a corresponding section in the mathematical 
appendix. The main changes in the third edition con- 
cerned the identity of simple structure factors after 
univariate selection, and the relations between two sets of 
variates. In the fourth, the principal addition was of 
Lawley’s formulae for the standard errors of individual
residuals, and of factor loadings, when maximum likelihood 
methods have been used. 

In the present (the fifth) edition it has for the first time 
been possible to reset the whole book. This has permitted 
more extensive alterations to be made, and the oppor- 
tunity has been taken of rearranging the order of the chap- 
ters and recasting several of them, as well as inserting in 
their proper places in the text those pages which in former 
editions had to be added as appendices. Chapters V, 
VIII, and X will supply the minimum of technique, and the 
remainder of Parts II and III will give in addition a description
of the methods of analysis using principal components,
using the principle of maximum likelihood, or using 
Thurstone’s Simple Structure. 

I hope, however, that readers will not merely use the book 
as a set of recipes on how to carry out certain computations, 
but will study the geometrical explanations (twelve new 
diagrams have been added): and especially that they will 
ponder the implications of the two chapters, XVIII and 
XIX, on the influence of selection on factors, and the final 
two chapters on the sampling theory and certain funda- 
mental questions. 


GODFREY H. THOMSON
UNIVERSITY OF EDINBURGH, 
April 1951 
All science starts with hypotheses—in other words,
with assumptions that are unproved, while they may be, 
and often are, erroneous; but which are better than 
nothing to the searcher after order in the maze of pheno- 
mena. 


T. H. HUXLEY


I am not insensible of the advantage which accrues to 
Applied Mathematics from the co-operation of the Pure 
Mathematician, and this co-operation is not infrequently 
called forth by the very imperfections of writers on Applied 
Mathematics. 


R. A. FISHER 


PART I 


THE TWO-FACTOR THEORY AND ITS 
EXTENSIONS 




CHAPTER I 
THE THEORY OF TWO FACTORS 


1. Factor tests.—The object of this book is to give some
account of the “factorial analysis” of ability, as it is
called. In actual practice at the present day this science
is endeavouring (with what hope of success is a matter of
keen controversy) to arrive at an analysis of mind based
on the mathematical treatment of experimental data
obtained from tests of intelligence and of other qualities,
and to improve vocational and scholastic advice and
prediction by making use of this analysis in individual
cases. It is a development of the “testing” movement,
the movement in which experimenters endeavour to devise
tests of intelligence and other qualities in the hope of
sorting mankind, and especially children, into different
categories for various practical purposes: educational (as
in directing children into the school courses for which they
are best suited); administrative (as in deciding that some
persons are so weak-minded as to need lifelong institutional
care); or vocational, etc.

There are many psychologists who would deny that from
the scores in such tests, or indeed from any analysis, we
can (ever) return to a full picture of the individual; and
without entering into any discussion of the fundamental
controversy which this denial reveals, everyone who has
had anything to do with tests will readily agree that this
is certainly so at present in practice. But the tester may
be allowed to try to make his modest diagram of the
individual better, more useful, and if possible simpler.
Now, the broadest fact about the results of “tests” of
all sorts, when a large number of them is given to a large
number of people, is that every individual and every test
is different from every other, and yet that there are certain
rather vague similarities which run through groups of
people or groups of tests, not very well marked off from
one another but merging imperceptibly into neighbouring
groups at their margins. To describe an individual ac- 
curately and completely one would have to administer to 
him all the thousand and one tests which have been or 
may be devised, and record his score in each, an impossible 
plan to carry out, and an unwieldy record to use even if 
obtained. Both practical necessity and the desire for 
theoretical simplification lead one to seek for a few tests 
which will describe the individual with sufficient accuracy, 
and possibly with complete accuracy if the right tests can 
be found. If, as has been said, there is some tendency 
for the tests to fall into groups, perhaps one test from each 
group may suffice. Such a set of tests might then be said 
to measure the “ factors " of the mind. 

2. Fictitious factors.—Actually the progress of the 
“ factorial " movement has been rather different, and the 
factors are not real but as it were fictitious tests which 
represent certain aspects of the whole mind. But con- 
ceivably it might have taken the more concrete form. In 
that case the “factor tests” finally decided upon (by
whom, the reader will ask, and when “finally”?) would
be a set of standards which, like any other standards, would 
have to be kept inviolate, and unchanged except at rare 
intervals and for good reasons. Some tendency towards 
this there has been. The Binet scale of tests is almost an 
international standard, and there is a general agreement 
that it must not be changed except by certain people upon 
whose shoulders Binet's mantle has fallen, and only seldom 
and as little as possible even by them. But the Binet 
scale is a very complex entity, and rather represents many 
groups of tests than any one test. By “ factor tests " one 
would more naturally mean tests of a “ pure " nature, 
differing widely from one another so as to cover the whole 
personality adequately. And since actual tests always 
are more or less mixed, it is understandable why “factors”
have come to be fictitious, not real, tests, to be each
approximated by various combinations of real tests so
weighted that their unwanted aspects tend to cancel out,
and their desired aspects to reinforce one another, the team
approximating to a measure of the pure “factor.”




But how, the reader will ask, do we know a “pure”
factor, how are we to tell when the actual tests approximate
to it? To give a preliminary answer to that question we
must go back to the pioneer work of Professor Charles
Spearman in the early years of this century (Spearman,
1904). The main idea which still, rightly or wrongly,
dominates factorial analysis was enunciated then by him,
and practically all that has been done since has been either
inspired or provoked by his writings. His discovery was
that the “coefficients of correlation” between tests tend
to fall into “hierarchical order,” and he saw that this
could be explained by his famous “Theory of Two Factors.”
These technical terms we must now explain.

3. Hierarchical order.—A coefficient of correlation is a
number which indicates the degree of resemblance between
two sets of marks or scores. If a schoolmaster, for example,
gives two examination papers to his class, say (1) in
arithmetic and (2) in grammar, he will have two marks for every
boy in the class. If the two sets of marks are identical
the correlation is perfect, and the correlation coefficient,
denoted by the symbol $r_{12}$, is said to be +1. If by some
curious chance the one list of marks is exactly like the
other one upside down (the best boy at arithmetic being
worst at grammar, and so on), the correlation is still perfect,
but negative, and $r_{12} = -1$. If there is absolutely no
resemblance between the two lists, $r_{12} = 0$. If there is a
strong resemblance, but falling short of identity, $r_{12}$ may
equal .9; and so on. There is a method (the Bravais-
Pearson) of calculating such coefficients, given the list of
marks.* “Tests” can obviously be correlated just like


* The “product-moment formula” is

$$r_{12} = \frac{\mathrm{sum}(x_1 x_2)}{\sqrt{\mathrm{sum}(x_1^2) \times \mathrm{sum}(x_2^2)}}$$

where $x_1$ and $x_2$ are the scores in the two tests, measured from the
average (so that approximately half the scores are negative) and
the sums are over the persons to whom the scores apply. The
quantity

$$\sigma_1^2 = \frac{\mathrm{sum}(x_1^2)}{\text{number of persons}}$$

is called the variance of Test 1, and $\sigma_1$ its standard deviation. If the
scores in each test are not only measured from their average, but
examinations, and a convenient form in which to write
down the intercorrelations of a number of tests is in a
square chequer board with the names of the tests (say
a, b, c, ...) written along the two margins, thus:


            a      b      c      d      e      f
   a        .     .48    .24    .54    .42    .30
   b       .48     .     .32    .72    .56    .40
   c       .24    .32     .     .36    .28    .20
   d       .54    .72    .36     .     .63    .45
   e       .42    .56    .28    .63     .     .35
   f       .30    .40    .20    .45    .35     .

 Totals   1.98   2.48   1.40   2.70   2.24   1.70
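As a minimal sketch of the product-moment calculation described for the schoolmaster's two lists of marks, the following computes the coefficient from scores measured from their averages. The marks themselves are invented for illustration and are not taken from the text; the function name `pearson_r` is likewise only a label chosen here.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation of two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Scores measured from their averages, so roughly half are negative.
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sum_products = sum(a * b for a, b in zip(dx, dy))
    return sum_products / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Hypothetical marks for five boys (illustrative values only).
arithmetic = [10, 20, 30, 40, 50]
grammar = [12, 22, 32, 42, 52]

r_same_order = pearson_r(arithmetic, grammar)       # lists rank the boys identically
r_reversed = pearson_r(arithmetic, grammar[::-1])   # one list upside down
```

With these made-up marks the first coefficient is +1 and the second is -1, the two extreme cases the text describes.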


It was early found that such correlations tend to be 
positive, and it is of some interest to see which of a number 
of tests correlates most with the others. This can be found
by adding up the columns of the chequer board, when we
see in the above example that the column referring to
Test d has the highest total (2.70). The tests can then be
rearranged and numbered in the order of these totals, thus:


              1      2      3      4      5      6
              d      b      e      a      f      c
   1   d      .     .72    .63    .54    .45    .36
   2   b     .72     .     .56    .48    .40    .32
   3   e     .63    .56     .     .42    .35    .28
   4   a     .54    .48    .42     .     .30    .24
   5   f     .45    .40    .35    .30     .     .20
   6   c     .36    .32    .28    .24    .20     .
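The adding up of columns and the renumbering of the tests can be sketched as follows. The correlations are those of the fictitious six-test example; the variable names are assumptions made for this illustration.

```python
tests = ["a", "b", "c", "d", "e", "f"]

# Chequer board of the fictitious example; None marks the blank diagonal cells.
corr = [
    [None, 0.48, 0.24, 0.54, 0.42, 0.30],
    [0.48, None, 0.32, 0.72, 0.56, 0.40],
    [0.24, 0.32, None, 0.36, 0.28, 0.20],
    [0.54, 0.72, 0.36, None, 0.63, 0.45],
    [0.42, 0.56, 0.28, 0.63, None, 0.35],
    [0.30, 0.40, 0.20, 0.45, 0.35, None],
]

# Column totals, skipping the blank diagonal cell in each column.
totals = {
    name: round(sum(row[j] for row in corr if row[j] is not None), 2)
    for j, name in enumerate(tests)
}

# Tests rearranged in descending order of their column totals.
order = sorted(tests, key=lambda name: totals[name], reverse=True)
```

Here `order` comes out as d, b, e, a, f, c, which is the numbering 1 to 6 used in the rearranged table.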


After the tests have been thus arranged, the tendency 
which Professor Spearman was the first to notice, and which 


are then divided through by their standard deviation, they are said
to be standardized, and we represent them by $z_1$ and $z_2$. About
two-thirds of them, then, lie between plus and minus one. With
such scores Pearson's formula becomes:

$$r_{12} = \frac{\text{sum of the products } z_1 z_2}{\text{number of persons } p}$$

In theoretical work, an even larger unit is used, namely $\sigma\sqrt{p}$.
With these units, the sum of the squares is unity, and the sum of the
products is the correlation coefficient. The scores are then said to
be normalized, but note that this does not mean distributed in a
“normal” or Gaussian manner.
he called “hierarchical order,” is more easily seen. It is
the tendency for the coefficients in any two columns to have
a constant ratio throughout the column. Thus in our
example, if we fix our attention on Columns a and f, say,
they run (omitting the coefficients which have no partners)
thus:


        .54    .45
        .48    .40
        .42    .35
        .24    .20


and every number on the right is five-sixths of its partner
on the left.
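The constant ratio between Columns a and f can be checked directly. This is only a small sketch using the four partnered coefficients quoted above; the dictionary layout is an assumption of the illustration.

```python
# Partnered coefficients of Columns a and f, keyed by the row test.
column_a = {"d": 0.54, "b": 0.48, "e": 0.42, "c": 0.24}
column_f = {"d": 0.45, "b": 0.40, "e": 0.35, "c": 0.20}

# Ratio of each right-hand number to its partner on the left.
ratios = {row: column_f[row] / column_a[row] for row in column_a}

# Perfect hierarchical order means every ratio is the same (here five-sixths).
is_constant = all(abs(ratio - 5 / 6) < 1e-9 for ratio in ratios.values())
```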

Our example is a fictitious one, and the tendency to
hierarchical order in it has been made perfect in order to
emphasize the point. It must not be supposed that the
tendency is as clear in actual experimental data. Indeed,
at the time there were some who denied altogether the
existence of any such tendency in actual data. Those who
did so were, however, mistaken, although the tendency is
not as strong as Professor Spearman would seem originally
to have thought (Spearman and Hart, 1912). The following
is a small portion of an actual table of correlation
coefficients* from those days (Brown, 1910, 309). (Complete
tables must, of course, include many more tests; in recent
work as many as 57 in one table.)


          1      2      3      4      5      6
   1      .     .78    .45    .27    .59    .30
   2     .78     .     .48    .28    .51    .24
   3     .45    .48     .     .52    .40    .38
   4     .27    .28    .52     .     [?]    .38
   5     .59    .51    .40    [?]     .     .13
   6     .30    .24    .38    .38    .13     .


* In this, as in other instances where data for small examples are
taken from experimental papers, neither criticism nor comment is
in any way intended. Illustrations are restricted to few tests for
economy of space and clearness of exposition, but in the experiments
from which the data are taken many more tests are employed, and
the purpose may be quite different from that of this book.
4. G saturations.—This tendency to “hierarchical order”
was explained by Professor Spearman by the hypothesis
that all the correlations were due to one “factor” only,
present in every test, but present in largest amount in the
test at the head of the hierarchy. This factor is his famous
“g,” to which he gave only this algebraic name to avoid
making any suggestions as to its nature, although in some
papers and in The Abilities of Man he permitted himself
to surmise what that nature might be. Each test had also
a second factor present in it (but not to be found elsewhere,
except indeed in very similar varieties of the same test),
whence the name, “Theory of Two Factors”: really one
general factor, and innumerable second or specific factors.

It will be proved in the Mathematical Appendix* that
this arrangement would actually give rise to “hierarchical
order.” Meanwhile this can at least be made plausible.
For if Test d has that column of correlations (the first
in our table) with the other tests solely because it is
saturated with so-and-so much g; and if Test b has less g
in it than d has, it seems likely enough that b’s column of
correlations will all be smaller in that same proportion.
We can, moreover, find what these “saturations” with g
are. For on the theory, each of our six tests contains the
factor g, and another part which has nothing to do with
causing correlation. Moreover, the higher the test is in
the hierarchical ranking, the more it is “saturated” with g.
Imagine now a fictitious test which had no specific, a test
for g and for nothing else, whose saturation with g is 100 per
cent. or 1.0. This fictitious test would, of course, stand
at the head of the hierarchy, above our six real tests, and
its row of correlations with each of those tests (their
“saturations”) would each be larger than any other in the
same column. What values would these saturations take?

Before we answer this, let us direct our attention to the
diagonal cells of the “matrix” of correlations (as it is
called, meaning a square or oblong set of numbers),
cells which we have up to the present left blank. Since
each number in our matrix represents the correlation of the
two tests in whose column and row it stands, there should
be inserted in each diagonal cell the number unity,
representing the correlation of a test with its own identical self.
In these self-correlations, however, the specific factor of
each test, of course, plays its part. These self-correlations
of unity are the only correlations in the whole table in
which specifics do play any part. These “unities,”
therefore, do not conform to the hierarchical rule of
proportionality between the columns.

* Chapter XVIII, end of Section 6, page 283.


               g        1        2        3        4        5        6
   g           1     r_{1g}   r_{2g}   r_{3g}   r_{4g}   r_{5g}   r_{6g}
   1        r_{1g}      .       .72      .63      .54      .45      .36
   2        r_{2g}     .72       .       .56      .48      .40      .32
   3        r_{3g}     .63      .56       .       .42      .35      .28
   4        r_{4g}     .54      .48      .42       .       .30      .24
   5        r_{5g}     .45      .40      .35      .30       .       .20
   6        r_{6g}     .36      .32      .28      .24      .20       .


But the case is different with the fictitious test of pure g.
It has no specific, and its self-correlation of unity should
conform to the hierarchy. If, therefore, we call the
“saturations” of the other tests $r_{1g}$, $r_{2g}$, $r_{3g}$, $r_{4g}$, $r_{5g}$, and $r_{6g}$,
we see that we must have, as we come down the first two
columns within the matrix:

$$\frac{1}{r_{1g}} = \frac{r_{2g}}{.72} = \frac{r_{3g}}{.63} = \frac{r_{4g}}{.54} = \frac{r_{5g}}{.45} = \frac{r_{6g}}{.36}$$

and similar equations for each other column with the g
column, which together indicate that the six “saturations”
are:

        .9    .8    .7    .6    .5    .4

Furthermore, each correlation in the table is the product
of two of these saturations. Thus:

$$.72 = .9 \times .8$$
$$.42 = .7 \times .6$$
$$r_{34} = r_{3g} \times r_{4g}$$

The six tests can now be expressed in the form of
equations:

$$z_1 = .9g + .436 s_1$$
$$z_2 = .8g + .600 s_2$$
$$z_3 = .7g + .714 s_3$$
$$z_4 = .6g + .800 s_4$$
$$z_5 = .5g + .866 s_5$$
$$z_6 = .4g + .917 s_6$$
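One way of recovering the six saturations from the correlations alone is sketched below. It relies on the fact, which follows from each correlation being the product of two saturations, that $r_{ig}^2 = r_{ij} r_{ik} / r_{jk}$ for any two other tests j and k. Averaging over all such pairs is a choice made for this illustration and is not claimed to be the procedure of the text; the function name is hypothetical.

```python
import math
from itertools import combinations

# Fictitious correlations in hierarchical order (d, b, e, a, f, c); None = blank diagonal.
R = [
    [None, 0.72, 0.63, 0.54, 0.45, 0.36],
    [0.72, None, 0.56, 0.48, 0.40, 0.32],
    [0.63, 0.56, None, 0.42, 0.35, 0.28],
    [0.54, 0.48, 0.42, None, 0.30, 0.24],
    [0.45, 0.40, 0.35, 0.30, None, 0.20],
    [0.36, 0.32, 0.28, 0.24, 0.20, None],
]

def g_saturation(R, i):
    """Saturation of test i, from r_ig^2 = r_ij * r_ik / r_jk averaged over pairs (j, k)."""
    others = [k for k in range(len(R)) if k != i]
    estimates = [R[i][j] * R[i][k] / R[j][k] for j, k in combinations(others, 2)]
    return math.sqrt(sum(estimates) / len(estimates))

saturations = [round(g_saturation(R, i), 3) for i in range(len(R))]

# Specific saturations chosen so that both squares sum to unity in each test.
specifics = [round(math.sqrt(1 - s * s), 3) for s in saturations]
```

For a perfectly hierarchical table every pair gives the same estimate, so the averaging changes nothing; with real data the estimates would differ and some rule for combining them would be needed.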
Herein, each z represents the score of some person in the
test indicated by the subscript, a score made up of that 
person’s g and specific in the proportions indicated by the 
coefficients. The scores are supposed measured from the 
average of all persons, being reckoned plus if above the 
average and minus if below; and so too are the factors g 
and the specifics. And each of them, tests and factors, is 
“ standardized,” i.e. measured in such units that the sum 
of the squares of all the scores equals the number of 
persons. This is achieved by dividing the raw scores by the 
“ standard deviation.” The saturations of the specifics 
are such that the sum of the squares of both saturations 
comes in each test to unity, the whole variance of that test. 


Thus $.436 = \sqrt{1 - .9^2}$.


5. A weighted battery.—This brief outline of the Theory
of Two Factors must for the moment suffice. It is
enough to enable the question to be answered which at the
end of our Section 2 led to the digression. “How,” the
reader asked, “do we know a pure factor, how are we to
tell when the actual tests approximate to it?” In the
Two-factor Theory the important pure factor was g itself,
and a test approximated to it the more, the higher it stood
in the hierarchy. Its accuracy of measurement of g was
indicated by its “saturation.” And a battery of
hierarchical tests could be weighted so as to have a combined
saturation higher than that of any one member, each test
for this purpose being weighted (as will be shown in Chapter
XV) by a number proportional to

$$\frac{r_{ig}}{1 - r_{ig}^2}$$

where $r_{ig}$ is the g saturation of Test i (Abilities, p. xix).
The battery saturation or multiple correlation with g is then:

$$\sqrt{\frac{S}{1 + S}}$$

where

$$S = \sum_i \frac{r_{ig}^2}{1 - r_{ig}^2}$$

Although g remained a fiction, yet a complex test, made up
of a weighted battery of tests which were hierarchical,
could approach nearer and nearer to measuring it exactly,
as more tests were added to the hierarchy. Each test added
would have to conform to the rule of proportionality in its
correlations with the pre-existing battery. If it did not
do so it would have to be rejected. The battery at any
stage would form a kind of definition of g, which it
approached although never reached. And a man’s weighted
score in such a battery would be an estimate of his amount
of g, his general intelligence. The factorial description of
a man was at this period confined to one factor, since the
specific factors were useless as description of any man.
For one thing, they were innumerable; and for another,
being specific, they were only able to indicate how the man
would perform in the very tests in which, as a matter of
fact, we knew exactly how he had performed.
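The weighting of a hierarchical battery and its combined saturation can be sketched as follows, assuming the weight $r_{ig}/(1 - r_{ig}^2)$ and the battery saturation $\sqrt{S/(1+S)}$ with $S$ the sum of $r_{ig}^2/(1 - r_{ig}^2)$. The six saturations are those of the fictitious example; the function names are labels chosen for this illustration.

```python
import math

def battery_weights(saturations):
    """Weight for each test, proportional to r / (1 - r^2)."""
    return [r / (1 - r * r) for r in saturations]

def battery_saturation(saturations):
    """Multiple correlation of the weighted battery with g: sqrt(S / (1 + S))."""
    S = sum(r * r / (1 - r * r) for r in saturations)
    return math.sqrt(S / (1 + S))

saturations = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
weights = battery_weights(saturations)
combined = battery_saturation(saturations)

# Adding tests one at a time shows the combined saturation creeping upward.
growth = [battery_saturation(saturations[:k]) for k in range(1, 7)]
```

For a single test the formula returns that test's own saturation, and each further hierarchical test raises the figure, which stays below unity however many are added.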
[Figures 1, 2 and 3: overlapping-oval diagrams.]

6. Oval diagrams.—It is convenient at this point to
introduce a diagrammatic illustration which will be useful in the
less technical part of this book, although like all illustrations it
is only an analogy, and the analogy must not be pushed too far.
If we represent the two abilities, which are measured by tests, by
two overlapping ovals as in Figure 1, then the amount of the
overlap can be made to represent the degree to which these tests are
correlated. If we call the whole area of each oval the “variance”
of that ability, we shall be introducing the reader to another
technical term (of which a definition was given in the footnote
to page 5). Here it need mean nothing more than the total
amount of the ability. The overlap we shall call the “covariance.”
If the variances are each equal to unity, then the covariance is
the correlation coefficient. To make the diagram quantitative,
we can indicate in figures the contents of each part of
the variance, as in the instance shown, which gives a
correlation of 3/5, or .6. If the separate parts of each
variance (i.e. of each oval) do not add up to the same
quantity, but to $v_1$ and $v_2$ say, then the covariance (the
amount in the overlap) must be divided by $\sqrt{v_1 v_2}$ in order
to give the correlation. Thus, Figure 2 represents a
correlation of $3 \div \sqrt{4 \times 9} = .5$. No attempt is made
in the diagrams to make the actual areas proportional to the
parts of the variance; it is the numbers written in each cell
which matter.
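The arithmetic of the oval diagrams amounts to dividing the overlap by the square root of the product of the two variances. A minimal sketch follows; the variance of 5 for each oval of Figure 1 is an assumption inferred from the quoted fraction 3/5, since the figure itself is not reproduced here.

```python
import math

def oval_correlation(overlap, variance_1, variance_2):
    """Correlation from an oval diagram: covariance / sqrt(v1 * v2)."""
    return overlap / math.sqrt(variance_1 * variance_2)

# Figure 1 (assumed): overlap 3, each oval totalling 5.
figure_1 = oval_correlation(3, 5, 5)

# Figure 2: overlap 3, ovals totalling 4 and 9.
figure_2 = oval_correlation(3, 4, 9)
```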


The four abilities represented by four tests can clearly
overlap in a complicated way, as in Figure 3, which shows [...]

The early Theory of Two Factors assumed that none
of the cells of such a diagram had any contents save those
marked g and s, the general and the specific factors. The
“variance” of each ability was in that theory completely
accounted for by the variance due to g, and the variance
due to s.


7. Tetrad-differences.—In Section 3 it was explained that
the discovery made by Professor Spearman was that the
correlation coefficients in two columns tend to be in the
same ratio as we go up and down the pair of columns.
That is to say, if we take the columns belonging to Tests
b and f, and fix our attention on the correlations which
b and f make with d and e, we have:

             b      f
        d   .72    .45
        e   .56    .35

where $.72 / .45 = .56 / .35$.

This may be written:

$$.72 \times .35 - .45 \times .56 = 0$$
and in this form is called a “tetrad-difference.” In
symbols this one is:

$$r_{db} r_{ef} - r_{df} r_{eb} = 0$$

Spearman's discovery may therefore be put thus: “The
tetrad-differences are, or tend to be, zero.” It is clear that
this will be so if, as we said was the case in the Theory of
Two Factors, each correlation is the product of two
correlations with g. For then the above tetrad-difference
becomes:

$$r_{dg} r_{bg} r_{eg} r_{fg} - r_{dg} r_{fg} r_{eg} r_{bg}$$

which is identically zero. The present-day test for
hierarchical order in a correlation matrix is to calculate all the
tetrad-differences (always avoiding the main diagonal) and
see if they are sufficiently small. If they are, then the
correlations can be explained by a diagram of the same
nature as Figure 3, by one general factor and specifics. It
is, of course, not to be expected in actual experimenting
that the tetrad-differences will be exactly zero; no
experiment on human material can be as accurate as that. What
is required is that they shall be clustered round zero in a
narrow curve, falling off steadily in frequency as zero is
departed from. The number of tetrad-differences increases
very rapidly as the number of tests grows, and in an actual
experimental battery the tetrads are very numerous indeed.
In the small portion of a real correlation table given above
(page 7), with six tests, there are 45 tetrad-differences,*
and in this instance they are distributed as follows (taking
absolute values only and disregarding signs, which can be
changed by altering the order of the tests):


From .0000 to .0999, 28 tetrad-differences.
From .1000 to .1999, 13 tetrad-differences.
From .2000 to .2796, 4 tetrad-differences.


This distribution of tetrads can be represented by a
“histogram” like that shown in Figure 4, which explains
itself. It is clear that some criterion is required by which
we can know whether the distribution of tetrad-differences,
after they have been calculated, is narrow enough to justify
us in assuming the Theory of Two Factors. This criterion

* Not all independent. 
is explained in Chapter III, page 41. One form of it con-
sists in drawing a distribution curve to which, on grounds 
of sampling, the tetrad-differences may be expected to con- 
form. Any tetrad-differences which seem to be too large 
to be accounted for by the Theory of Two Factors are then 
examined, to see whether the tests giving them have any 
special points of resemblance, 
in content, method, or other- 
wise, which may explain why 
they disturb the hierarchy. 
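The calculation of all the tetrad-differences of a table can be sketched as follows. Every set of four tests yields three products of paired correlations and hence three differences, so six tests give 15 sets and 45 tetrad-differences. The table used here is the fictitious perfectly hierarchical one, built from its saturations; the function name is an assumption of this illustration.

```python
from itertools import combinations

def tetrad_differences(R):
    """All tetrad-differences of a correlation table, three for each set of four tests."""
    differences = []
    for i, j, k, l in combinations(range(len(R)), 4):
        # The three ways of pairing four tests; none touches the main diagonal.
        p1 = R[i][j] * R[k][l]
        p2 = R[i][k] * R[j][l]
        p3 = R[i][l] * R[j][k]
        differences.extend([p1 - p2, p1 - p3, p2 - p3])
    return differences

# Fictitious example: each correlation is the product of two g saturations.
saturations = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
R = [
    [saturations[i] * saturations[j] if i != j else None for j in range(6)]
    for i in range(6)
]

tetrads = tetrad_differences(R)
largest = max(abs(t) for t in tetrads)
```

For this artificial table every difference vanishes apart from rounding in the arithmetic; with experimental correlations the same function would give the spread of values that the histogram summarizes.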

[Figure 4: histogram of the tetrad-differences.]

8. Group factors.—As time went on it became clear that
the tendency to zero tetrad-differences, though strong, was
not universal enough to permit an explanation of all
correlations between tests in terms of g and specifics, with a few
slight “disturbers” in the form of slightly overlapping
specifics. It became necessary to call in group factors,
which run through many though not through all tests,
to explain the deviations from strict hierarchical order.
The Spearman school of experimenters, however, tend
always to explain as much as possible by one central
factor, and to use group factors only when necessitated.
They take the point of view that a group factor must, as
it were, establish its right to existence, that the onus of
proof is on him who asserts a group factor.

As a tiny artificial illustration, a matrix of correlation
coefficients:

[4 × 4 correlation matrix for Tests 1 to 4; entries not recoverable]

would be examined, and its three tetrad-differences found
to be:

        zero
        .15
        .15
Inspection shows that the correlation $r_{23}$ is the cause of
the discrepancies from zero, and the experimenter trained
in the Two-factor school would therefore explain these
correlations by a central factor running through them all,
plus a special link joining Tests 2 and 3, as in Figure 5.
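To make the inspection concrete, here is a sketch with an assumed matrix of four tests. The values are invented solely so that the three tetrad-differences come out as zero, .15 and .15 with $r_{23}$ as the disturbing correlation; they are not claimed to be the figures printed in the text.

```python
# Assumed correlations: every pair correlates .5 except Tests 2 and 3, which
# correlate .8 (a central factor plus an extra link between 2 and 3).
R = [
    [1.0, 0.5, 0.5, 0.5],
    [0.5, 1.0, 0.8, 0.5],
    [0.5, 0.8, 1.0, 0.5],
    [0.5, 0.5, 0.5, 1.0],
]

# The three tetrad-differences of four tests (indices 0..3 stand for Tests 1..4).
t_a = R[0][1] * R[2][3] - R[0][2] * R[1][3]  # r12 r34 - r13 r24, does not involve r23
t_b = R[0][1] * R[2][3] - R[0][3] * R[1][2]  # r12 r34 - r14 r23
t_c = R[0][2] * R[1][3] - R[0][3] * R[1][2]  # r13 r24 - r14 r23

absolute = sorted(abs(t) for t in (t_a, t_b, t_c))
```

Only the two differences containing $r_{23}$ depart from zero, which is what points to a special link between Tests 2 and 3.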

There are innumerable other possible ways of explaining 
these same correlations. For 
example, the linkages between 
the tests might be as in Figure 6, 
which gives exactly the same cor- 
relations. This lack of unique- 
ness is something which must 
always be borne in mind in study- 
ing factorial analysis. There are 
always, as here, innumerable 
possible analyses, and the final 
decision between them has to be 
made on some other grounds. 
The decision may be psycho- 
logical, as when for example in 
the above case an experimenter 
chooses one of the possible dia- 
grams because it best agrees with 
his psychological ideas about the 
tests. Or the decision may be 
made on the ground that we 
should be parsimonious in our
invention of “ factors," and that 
where one general and one group factor will serve we should 
not invent five group factors as required by Figure 6. 
Both diagrams, however, fit the correlational facts exactly, 
and so also would hundreds of other diagrams which might 
be made. As has been said, the two-factor tendency is to 
take the diagram with the largest general factor (and the 
largest specifics also) and with as few group factors as 
possible. 

[Figure 6.]

9. The verbal factor.—In this way the Theory of Two
Factors has gradually extended the “two” to include, in
addition to g and specifics, a number of other group factors,
still, however, comparatively few. These group factors
bear such names as the verbal factor v, a mechanical factor
m, an arithmetic factor, perseveration, etc. The
characteristic method of the Two-factor school can be well
seen, without any technical difficulties unduly obscuring
the situation, in the search for a verbal factor. The idea
that, in addition to a man’s g (which is generally thought
of as something innate) there may be an acquired factor
of verbal facility which enables him to do well in certain
tests, is a not unnatural one. A battery of tests can be
assembled of which half do, and half do not, employ words
in their construction or solution. The correlation matrix
will then have four quadrants, the quadrant V containing
the correlations of the verbal tests among themselves, the
quadrant P the correlations of the non-verbal or, say,
pictorial tests, and the quadrants C containing the cross- 
correlations of the one kind of test with the other. If the 
whole table is sufficiently “ hierarchical,” there is no 
evidence for a group factor v or a group factor p. If 
either of these factors exists, there will be differences to be 


noticed between the six kinds of tetrad which can be
chosen, namely:


          p   p             v   v             p   p
      v   .   .         v   x   x         p   x   x
      v   .   .         v   x   x         p   x   x
         (1)               (2)               (3)

          v   p             v   p             v   p
      v   x   .         p   .   x         v   x   .
      v   x   .         p   .   x         p   .   x
         (4)               (5)               (6)
A tetrad like 1, with two verbal tests along one margin
and two pictorial tests along the other, will be found in 
quadrant C. Neither a factor common to the verbal tests 
only, nor one common to the pictorial tests only, will add 
anything to any of the four correlations in such a tetrad- 
difference, which may be expected, therefore, to tend to be 
zero. If the tetrads in C seem to do so, the other tetrads 
can be examined. Tetrad 2 is taken wholly from the V 
quadrant. In it the verbal factor, if any is present, will 
reinforce all the four correlations, and should not therefore 
disturb very much the tendency to a zero tetrad-difference. 
(Reinforced correlations are marked by x in the diagrams.) 
The same is true of Tetrad 3 taken wholly from the P 
quadrant. Tetrads 4 and 5 have each two of their cor- 
relations reinforced, by the v factor in 4 and by the p 
factor in 5, but in each case in such a way as not to change 
very much the tetrad-difference. It is when we come to 
tetrads like 6, which have one correlation in each of the 
four quadrants, that the presence of either or both factors 
should show itself strongly: for the two reinforced correlations
here occur on a diagonal, and inflate only the one
member of the tetrad-difference:

$$r_{vv}\,r_{pp} - r_{vp}\,r_{pv}$$

If, then, a verbal factor, and also a pictorial factor, are 
present, the tendency for the tetrad-differences to vanish 
should become less and less strong as we consider tetrads 
of the kinds 1, 2 and 3, 4 and 5, and especially 6, where
the tetrad-differences should leap up. If only the verbal 
factor is present, tetrad-differences of the kind 3 should 
vanish rather more than those of the kind 2. But it will 
not be easy to distinguish between either suspected factor, 
and both. Tetrads like 6, however, should give conclusive 
evidence of the presence of one or the other, if not both. 
Methods like this were employed by Miss Davey (Davey, 
1926), who found a group factor, but not one running 
through all the verbal tests, and by Dr. Stephenson 
(Stephenson, 1931), whose results indicated the presence 
of a verbal factor.* 
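
The contrast between tetrads of kind 1 and kind 6 can be checked on a small invented model. The sketch below is only an illustration: the loadings are hypothetical, and the names `G`, `GROUP`, `corr` and `tetrad` are not from the text. Each correlation is taken as the product of the g loadings, plus the product of the group loadings when both tests are of the same kind.

```python
# Hypothetical loadings: every test has g; verbal tests (v..) share a factor v,
# pictorial tests (p..) share a factor p.
G = {"v1": 0.7, "v2": 0.6, "p1": 0.6, "p2": 0.5}
GROUP = {"v1": 0.5, "v2": 0.4, "p1": 0.5, "p2": 0.6}

def corr(a, b):
    """Model correlation: g part, reinforced when both tests are of one kind."""
    r = G[a] * G[b]
    if a[0] == b[0]:                  # both verbal or both pictorial
        r += GROUP[a] * GROUP[b]
    return r

def tetrad(row1, row2, col1, col2):
    """Tetrad-difference for tests row1, row2 on one margin and col1, col2 on the other."""
    return corr(row1, col1) * corr(row2, col2) - corr(row1, col2) * corr(row2, col1)

kind1 = tetrad("v1", "v2", "p1", "p2")   # wholly in quadrant C: nothing reinforced
kind6 = tetrad("v1", "p1", "v2", "p2")   # one correlation from each quadrant
print(round(kind1, 4), round(kind6, 4))
```

In this model the kind 1 tetrad vanishes because only g enters its four correlations, while the kind 6 tetrad is inflated by the two reinforced correlations on its diagonal.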

* T. L. Kelley had already found by other methods strong evidence
of a verbal factor (Kelley, 1928, 104, 121 et passim).

—Just as the g saturations
also can the saturation of a 
The general 
work with 
s which give no unduly large tetrad- 
» and which also appear to satisfy one's general 
impression that they test intelligence. From such a 
battery, of which the best example is that of Brown and 
Stephenson (B. and S., 1933), the g saturations can be
calculated.* Each test has, however, also its specific,


ith some other battery of different 
these it may share a part of its 
up factor which will increase its 
caused by g. The excess correla- 
th this group 
hnical for this 

correspondingly 
reduced, Finally, able to give the 
composition of a test as, let us say (to invent an example)— 


$$z = .71g + .40v + .34n + .47s$$

    Factor    Saturation    Saturation Squared
    g            .71             .5041
    v            .40             .1600
    n            .34             .1156
    s            .47             .2209
                                1.0006


* For the sake of clarity the text here rather oversimplifies the
situation. The battery of Brown and Stephenson contains in fact
a group factor as well as g and specifics.


CHAPTER II 
BIFACTOR ANALYSIS AND CLUSTERS 


1. The bifactor method.—Holzinger’s Bifactor Method
(Holzinger, 1935, 1937a) may be looked upon as another 
natural extension of the simple Two-factor plan of analysis. 
It endeavours to analyse a battery of tests into one general 
factor and a number of mutually exclusive group factors. 
A diagram of such an analysis looks like a “ hollow stair- 
case," thus : 


Test g h k l
1 x x
2 x x
3 x x
4 x x
5 x x
6 x x
7 x x
8 x x
9 x x


Here factor g runs through all, as is indicated by the
column of crosses. Factors h, k, and l run through mutually
exclusive groups of tests each. The saturations with
g can be calculated from sub-batteries of tests which form
perfect hierarchies, by selecting only one test from each
group (in every possible way). After these are known,
the correlation due to g can be removed, and then the
saturations due to each group factor found.

The following artificial example will illustrate some of
the points of this method. Consider these correlations,
which to save space are printed without their decimal
points:

          1   2   3   4   5   6   7   8   9  10  11  12
     1    .  57  40  45  63  63  20  28  74  52  45  34
     2   57   .  34  25  53  39  17  44  68  43  39  56
     3   40  34   .  18  57  27  59  16  44  70  73  20
     4   45  25  18   .  27  51  09  12  32  22  20  15
     5   63  53  57  27   .  42  40  26  68  67  63  31
     6   63  39  27  51  42   .  13  18  50  34  30  23
     7   20  17  59  09  40  13   .  08  22  60  64  10
     8   28  44  16  12  26  18  08   .  35  21  18  43
     9   74  68  44  32  68  50  22  35   .  56  50  44
    10   52  43  70  22  67  34  60  21  56   .  78  25
    11   45  39  73  20  63  30  64  18  50  78   .  23
    12   34  56  20  15  31  23  10  43  44  25  23   .


There are two stages in a bifactor analysis. The first 
problem is to decide how to group the tests so that those 
are brought together which share a second or group factor. 
Then the best method of calculating is needed to find the
loadings.

The grouping can partly be done subjectively by con- 
sidering the nature of each test and putting together 
memory tests, or tests involving number, and so on. 
Holzinger uses a “ coefficient of belonging,” B, to determine 
the coherence of a group. B is equal to the average of the 
intercorrelations of the group divided by their average 
correlation with the other tests in the battery. The higher 
B is, the more the group is distinguishable as a group. 
He begins with a pair of tests which correlate highly with 
one another, and finds their B. Then he adds a third test 
and finds the B of the three. Then another and another, 
until B drops too low. There is no fixed threshold for B, 
but a rather sudden drop would indicate the end of a 
group. 
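
The coefficient of belonging is easy to compute once the definition is written out. The following sketch applies it to the twelve-test table of the artificial example; the matrix is entered as read from that table (diagonal set to zero), and the function name `belonging` and the order in which tests are added are illustrative assumptions, not Holzinger's own routine.

```python
ROWS = """
 0 57 40 45 63 63 20 28 74 52 45 34
57  0 34 25 53 39 17 44 68 43 39 56
40 34  0 18 57 27 59 16 44 70 73 20
45 25 18  0 27 51  9 12 32 22 20 15
63 53 57 27  0 42 40 26 68 67 63 31
63 39 27 51 42  0 13 18 50 34 30 23
20 17 59  9 40 13  0  8 22 60 64 10
28 44 16 12 26 18  8  0 35 21 18 43
74 68 44 32 68 50 22 35  0 56 50 44
52 43 70 22 67 34 60 21 56  0 78 25
45 39 73 20 63 30 64 18 50 78  0 23
34 56 20 15 31 23 10 43 44 25 23  0
"""
R = [[int(x) / 100 for x in line.split()] for line in ROWS.strip().splitlines()]

def belonging(group):
    """B = mean intercorrelation of the group / mean correlation with the other tests."""
    idx = [t - 1 for t in group]
    others = [j for j in range(len(R)) if j not in idx]
    inside = [R[i][j] for i in idx for j in idx if i < j]
    outside = [R[i][j] for i in idx for j in others]
    return (sum(inside) / len(inside)) / (sum(outside) / len(outside))

# Start from a highly correlated pair and add one test at a time.
for group in ([10, 11], [10, 11, 3], [10, 11, 3, 7], [10, 11, 3, 7, 1]):
    print(group, round(belonging(group), 2))
```

Adding Test 1 to the group 3, 7, 10, 11 lowers B, which is the kind of drop the text describes as marking the end of a group.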

2. Tryon’s grouping.—Another plan is to make a graph 
or profile of each row of correlations and compare these 
(Tryon, 1939), grouping together those tests with similar 
profiles. I find it easier to consider only the peaks of each
row and compare the rows with regard to these. If we
mark, in each row of the above, the five highest correlations
in that row, and also the diagonal cell, we get the following
set of peaks:

          1   2   3   4   5   6   7   8   9  10  11  12
     1    x   x   .   .   x   x   .   .   x   x   .   .
     2    x   x   .   .   x   .   .   x   x   .   .   x
     3    .   .   x   .   x   .   x   .   x   x   x   .
     4    x   x   .   x   x   x   .   .   x   .   .   .
     5    x   .   x   .   x   .   .   .   x   x   x   .
     6    x   x   .   x   x   x   .   .   x   .   .   .
     7    .   .   x   .   x   .   x   .   x   x   x   .
     8    x   x   .   .   x   .   .   x   x   .   .   x
     9    x   x   .   .   x   x   .   .   x   x   x   .
    10    .   .   x   .   x   .   x   .   x   x   x   .
    11    .   .   x   .   x   .   x   .   x   x   x   .
    12    x   x   .   .   x   .   .   x   x   .   .   x

We then see that, in the rows,

(a) Tests 3, 7, 10, 11 have identical peaks,

(b) Tests 2, 8, 12 have identical peaks,

(c) Tests 4, 6 have identical peaks,

and we take these as nuclei for three groups. There remain
Tests 1, 5, and 9. Their average correlations with
each of the above nuclei are:

    Test     a      b      c
     1      .39    .40    .54
     5      .57    .37    .35
     9      .43    .49    .41
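
The peak comparison and the averages with each nucleus can be reproduced mechanically. This is a sketch under stated assumptions: the matrix is the twelve-test table as read above, ties at the fifth place are all marked as peaks (the text does not say how ties were handled), and the names `peaks` and `mean_with` are illustrative.

```python
ROWS = """
 0 57 40 45 63 63 20 28 74 52 45 34
57  0 34 25 53 39 17 44 68 43 39 56
40 34  0 18 57 27 59 16 44 70 73 20
45 25 18  0 27 51  9 12 32 22 20 15
63 53 57 27  0 42 40 26 68 67 63 31
63 39 27 51 42  0 13 18 50 34 30 23
20 17 59  9 40 13  0  8 22 60 64 10
28 44 16 12 26 18  8  0 35 21 18 43
74 68 44 32 68 50 22 35  0 56 50 44
52 43 70 22 67 34 60 21 56  0 78 25
45 39 73 20 63 30 64 18 50 78  0 23
34 56 20 15 31 23 10 43 44 25 23  0
"""
R = [[int(x) / 100 for x in line.split()] for line in ROWS.strip().splitlines()]
N = len(R)

def peaks(i, k=5):
    """Test numbers of the k highest correlations in row i (ties kept), plus the diagonal."""
    row = [R[i][j] for j in range(N) if j != i]
    cutoff = sorted(row, reverse=True)[k - 1]
    return frozenset(j + 1 for j in range(N) if j == i or R[i][j] >= cutoff)

# Tests whose rows have identical peaks form the nuclei.
by_peaks = {}
for i in range(N):
    by_peaks.setdefault(peaks(i), []).append(i + 1)
nuclei = [tests for tests in by_peaks.values() if len(tests) > 1]

def mean_with(test, nucleus):
    """Average correlation of one remaining test with the tests of a nucleus."""
    return sum(R[test - 1][t - 1] for t in nucleus) / len(nucleus)

for test in (1, 5, 9):
    print(test, [round(mean_with(test, n), 2) for n in sorted(nuclei)])
```

Each remaining test is then attached to the nucleus with which its average correlation is highest, Test 9 being the least clear-cut case.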


We therefore add Test 1 to group c, Test 5 to group a,
and (less certainly) Test 9 to group b. We then rewrite
our matrix with the tests thus grouped:

          3   5   7  10  11 |   2   8   9  12   Sum |   1   4   6   Sum
     3    .  57  59  70  73 |  34  16  44  20  1.14 |  40  18  27   .85
     5   57   .  40  67  63 |  53  26  68  31  1.78 |  63  27  42  1.32
     7   59  40   .  60  64 |  17  08  22  10   .57 |  20  09  13   .42
    10   70  67  60   .  78 |  43  21  56  25  1.45 |  52  22  34  1.08
    11   73  63  64  78   . |  39  18  50  23  1.30 |  45  20  30   .95
                                               6.24                4.62

          3   5   7  10  11   Sum |   2   8   9  12 |   1   4   6   Sum
     2   34  53  17  43  39  1.86 |   .  44  68  56 |  57  25  39  1.21
     8   16  26  08  21  18   .89 |  44   .  35  43 |  28  12  18   .58
     9   44  68  22  56  50  2.40 |  68  35   .  44 |  74  32  50  1.56
    12   20  31  10  25  23  1.09 |  56  43  44   . |  34  15  23   .72
                             6.24                                  4.07

          3   5   7  10  11   Sum |   2   8   9  12   Sum |   1   4   6
     1   40  63  20  52  45  2.20 |  57  28  74  34  1.93 |   .  45  63
     4   18  27  09  22  20   .96 |  25  12  32  15   .84 |  45   .  51
     6   27  42  13  34  30  1.46 |  39  18  50  23  1.30 |  63  51   .
                             4.62                    4.07

It will be seen that certain additions (the totals of rows
and of blocks) have been made in
readiness for the various methods of calculation of the g
loadings which are then possible. If we symbolize the
table as

    A  D  E
    D  B  F
    E  F  C

all methods depend on using only the correlations in the
rectangles D, E, and F, since the suspected group factors
which increase the correlations in A, in B, and in C do not
influence D, E, and F. Each correlation in the latter
rectangles is therefore the product of two g-saturations
(see page 9). Thus:


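$$r_{13} = .40 = l_1 l_3, \qquad r_{23} = .34 = l_2 l_3, \qquad r_{12} = .57 = l_1 l_2,$$

$$l_3^2 = \frac{r_{13}\, r_{23}}{r_{12}} = \frac{.40 \times .34}{.57} = .2386, \qquad l_3 = .49,$$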


where it should be noted that the three correlations come 
from E, D, and F respectively. 

But this value for the loading of Test 3 depends upon three 
correlations only and would, in a real experimental set of 
data, vary somewhat with our choice of the three. A 
method of using all the possible correlations in these three 


rectangles is needed. One such is given by Holzinger in
his Manual (1937a).

3. Holzinger’s formula.—If all possible ways of choosing
the two other tests are taken, and the fraction

$$\frac{r_{3i}\, r_{3j}}{r_{ij}}$$

is formed in each case; and if the numerators of these fractions are
added together to form a global numerator, and their
denominators to form a global denominator; it will then
be found that the fraction thus formed is equal to

$$l_3^2 = \frac{1.14 \times .85}{4.07} = .24, \qquad l_3 = .49,$$

and this time all available correlations have been used.
The rule is to multiply the two totals in the row of the
test (1.14 × .85) and divide by the grand total of the
block formed by the other tests concerned (1, 4, and 6
with 2, 8, 9, and 12, i.e. 4.07). For Test 2 this rule gives

$$l_2^2 = \frac{1.86 \times 1.21}{4.62} = .49, \qquad l_2 = .70.$$

This Holzinger method is not difficult to extend to four
or more groups. If we symbolize a four-group matrix by

    A  D  E  G
    D  B  F  H
    E  F  C  K
    G  H  K  L

and consider the first test, then its g-loading l is given by

$$l^2 = \frac{de + eg + dg}{F + H + K}$$

where d, e, and g are the sums of its row in D, E, and G.
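
Holzinger's rule for three groups can be carried out directly on the rectangles D, E, and F of the grouped table. The sketch below is an illustration only; the helper names `total` and `holzinger` are assumptions, and the blocks are entered in hundredths as read from the table.

```python
from math import sqrt

# Rows are tests 3, 5, 7, 10, 11; columns are tests 2, 8, 9, 12 (rectangle D).
D = [[34, 16, 44, 20], [53, 26, 68, 31], [17, 8, 22, 10], [43, 21, 56, 25], [39, 18, 50, 23]]
# Rows are tests 3, 5, 7, 10, 11; columns are tests 1, 4, 6 (rectangle E).
E = [[40, 18, 27], [63, 27, 42], [20, 9, 13], [52, 22, 34], [45, 20, 30]]
# Rows are tests 2, 8, 9, 12; columns are tests 1, 4, 6 (rectangle F).
F = [[57, 25, 39], [28, 12, 18], [74, 32, 50], [34, 15, 23]]

def total(block):
    """Grand total of a rectangle, as a decimal."""
    return sum(map(sum, block)) / 100

def holzinger(sum_one, sum_two, other_total):
    """g loading: product of the test's two side totals over the total of the third rectangle."""
    return sqrt(sum_one * sum_two / other_total)

# Test 3 is the first row of D and of E; the other tests concerned form F.
l3 = holzinger(sum(D[0]) / 100, sum(E[0]) / 100, total(F))
# Test 2 is the first column of D and the first row of F; the other tests form E.
l2 = holzinger(sum(row[0] for row in D) / 100, sum(F[0]) / 100, total(E))
# The single-triad value for Test 3, from r13, r23 and r12 only.
l3_triad = sqrt(0.40 * 0.34 / 0.57)
print(round(l3, 2), round(l2, 2), round(l3_triad, 2))
```

The pooled value and the single-triad value agree here because the example is artificial; with real data the pooled value uses every available cross-correlation.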
4. Burt’s formula.—Another method is given by Burt
(1940, 478). For the numerator of each g loading he takes
the sum of the side totals which Holzinger multiplied.
Thus the numerators are:

    for Test  3,  1.14 +  .85 = 1.99
              5,  1.78 + 1.32 = 3.10
              . . .
             12,  1.09 +  .72 = 1.81
              . . .
              6,  1.46 + 1.30 = 2.76.

The denominators differ in group a, group b, and group c,
but all are formed from the three quantities 6.24, 4.62,
and 4.07. For group a the denominator is:

$$\sqrt{4.07}\left\{\sqrt{\frac{6.24}{4.62}} + \sqrt{\frac{4.62}{6.24}}\right\} = 4.08.$$

It will be seen that the two quantities within the curly
brackets are the totals of D and E, the two rectangles
from which the numerators of group a come. By analogy
the reader can write down the denominators of group b
and group c: they come to 4.40 and 5.01. Dividing the
numerators by the appropriate denominators, we get for
the g loadings:

    Test          3    5    7   10   11    2    8    9   12    1    4    6
    g Loading   .49  .76  .24  .62  .55  .70  .33  .90  .41  .82  .36  .55
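
Burt's denominators follow from the three block totals alone, which makes them easy to check. A minimal sketch, assuming the totals 6.24, 4.62 and 4.07 for D, E and F and the side totals quoted above; the dictionary names are illustrative.

```python
from math import sqrt

D, E, F = 6.24, 4.62, 4.07   # grand totals of the three rectangles

# Each denominator takes the root of the rectangle NOT touching the group,
# times the two root ratios of the rectangles that do touch it.
denominator = {
    "a": sqrt(F) * (sqrt(D / E) + sqrt(E / D)),
    "b": sqrt(E) * (sqrt(D / F) + sqrt(F / D)),
    "c": sqrt(D) * (sqrt(E / F) + sqrt(F / E)),
}

# Numerators: sum of the two side totals of the test, with the group of each test.
numerator = {3: (1.14 + 0.85, "a"), 5: (1.78 + 1.32, "a"),
             12: (1.09 + 0.72, "b"), 6: (1.46 + 1.30, "c")}

loading = {test: num / denominator[group] for test, (num, group) in numerator.items()}
for test, value in loading.items():
    print(test, round(value, 2))
```

Under the hypothesis that only g enters D, E and F, each denominator equals the sum of the g loadings of the two other groups, which is why the quotient reduces to the loading of the test itself.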


The proof of Burt’s formula is surprisingly easy. If the
reader will write down, in place of the correlations in D,
E, and F, the literal symbols $l_i l_j$ (for $r_{ij}$), since our
hypothesis is that only g is concerned in these correlations,
and will write out the sums, etc., of the above calculation
literally, he will find that Burt's formula simplifies almost
immediately to one $l$, that of the test in question. Burt
only gives his formula for three groups. It can be extended
to the case of more groups, but becomes cumbersome and
rather unwieldy.

5. The test of correct grouping.—Now comes the test of
whether our grouping is correct, and our hypothesis valid
that groups a, b, and c have nothing in common but the
factor g. Using the loadings we have found, form all the
products $l_i l_j$ and subtract them from the experimental
correlations. All the correlations in D, E, and F should
then vanish or, in a real set of data (ours are artificial),
become insignificant. There should, however, remain
residues in A, B, and C due to the second factors running
through groups a, b, and c respectively. In our example
the subtraction of the quantities $l_i l_j$ gives the residues
shown in the table that follows.


         g Loading    3   5   7  10  11 |   2   8   9  12 |   1   4   6
     3     .49        .  20  47  40  46 |
     5     .76       20   .  22  20  21 |
     7     .24       47  22   .  45  51 |
    10     .62       40  20  45   .  44 |
    11     .55       46  21  51  44   . |
     2     .70                          |   .  21  05  27 |
     8     .33                          |  21   .  05  29 |
     9     .90                          |  05  05   .  07 |
    12     .41                          |  27  29  07   . |
     1     .82                                            |   .  15  18
     4     .36                                            |  15   .  31
     6     .55                                            |  18  31   .

The correlations left in A, if they are due to only one
other factor (now that g has been removed), ought to show
zero or very small tetrads; and so they do. Those in B
are also hierarchical. Those in C are too few to form a
tetrad. The second factor in each of these submatrices
can now be found in the same way as g is found from a
matrix with no other factor: see page 9 and, later in this
book, pages 42 to 44. The reader should complete the
calculation, and will find these loadings:


                   Factors
    Test     g      u      v      w
      3     .49    .65
      5     .76    .30
      7     .24    .72
     10     .62    .62
     11     .55    .71
      2     .70           .44
      8     .33           .47
      9     .90           [illegible]
     12     .41           .62
      1     .82                  .29
      4     .36                  .50
      6     .55                  .62

An actual set of data will not give so perfect a hollow
staircase, but at this stage the strict bifactor hypothesis
can be departed from and additional small loadings or
further factors added to perfect the analysis. Where a
bifactor pattern exists, a simple method of extracting
correlated or oblique factors has been given by Holzinger
(1944) “based on the idea that the centroid pattern
coefficients for the sections of approximately unit rank
may be interpreted as structure values for the entire
matrix.”
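
The removal of g and the calculation of a second-factor loading in the artificial example can be sketched for group a as follows. This is an illustration under assumptions: the g loadings are the two-decimal values found above, each second-factor loading is taken from one triad of residues only, and the names `residue`, `res` and `second_loading` are not from the text.

```python
from math import sqrt

g = {3: 0.49, 5: 0.76, 7: 0.24, 10: 0.62, 11: 0.55}          # g loadings of group a
A = {(3, 5): .57, (3, 7): .59, (3, 10): .70, (3, 11): .73, (5, 7): .40,
     (5, 10): .67, (5, 11): .63, (7, 10): .60, (7, 11): .64, (10, 11): .78}

# Residue = experimental correlation minus the product of the two g loadings.
residue = {pair: r - g[pair[0]] * g[pair[1]] for pair, r in A.items()}

def res(i, j):
    """Residue for tests i and j, in either order."""
    return residue[(min(i, j), max(i, j))]

def second_loading(t, i, j):
    """Loading of test t on the group factor, from the triad of residues (t,i), (t,j), (i,j)."""
    return sqrt(res(t, i) * res(t, j) / res(i, j))

u3 = second_loading(3, 5, 7)
u5 = second_loading(5, 3, 7)
u10 = second_loading(10, 3, 11)
print(round(res(3, 5), 2), round(u3, 2), round(u5, 2), round(u10, 2))
```

Because the g loadings are rounded, different triads give slightly different values; they agree with the tabulated loadings to within about one unit in the second decimal place.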

6. Cluster analysis.—This is connected with the bifactor 
method, which is possible when clusters do not overlap. 
But it is by no means rare to find two or three variables 
entering into several distinct clusters. Raymond Cattell’s 
article (1944a) describes four methods of determining 
clusters, and gives references which will lead the interested 
reader back to much of the previous work, and see also 
Tryon's work Cluster Analysis, 1939. The most naive 
method of classifying tests into clusters, one needing no 
mathematics whatever, is simply to put together all the 
tests which intercorrelate above a certain level. We can 
illustrate this adequately on the above example. Let us 
collect into clusters tests which correlate with one another 
at least 0.40. A routine is desirable to ease the task and
avoid overlooking any clusters. Turn to the table on
page 20 and write down from the first row all the tests
which have correlations of 0.40 or more with Test 1,
including itself.

    1   2   3   4   5   6   9   10   11
    2   5   9   10
    5   9   10
    9   10
    10

    Cluster A, Tests 1, 2, 5, 9, 10.

Then consider the test next to No. 1 in this line, which
is No. 2, and go along its line in the correlation
table, noting which of the tests in the first line correlate
sufficiently with Test 2. They are 5, 9, and 10. The
other tests of our first line drop out. We then look along
the line of Test 5’s correlation coefficients, and find that
Tests 9 and 10 survive this scrutiny. Finally, we note
that Tests 9 and 10 themselves correlate enough. The
cluster A is therefore (reading down the left-hand edge of
the above triangular set of notes) composed of Tests 1, 2,
5, 9, and 10. At this point, to avoid missing other clusters
which may begin with Test 1, it is necessary to consider
what would have happened had Test 2 not been in the
battery. It would be tedious to describe the whole procedure
here, but the reader is urged to go through it, when
he will find six clusters, shown in this diagram.
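
The first pass of this routine, which yields cluster A, can be written as a short loop. The sketch below covers only that single pass from one starting test; it does not attempt the further step of asking what would have happened had a test been absent. The matrix is the twelve-test table in hundredths, and `grow_cluster` is an illustrative name.

```python
ROWS = """
 0 57 40 45 63 63 20 28 74 52 45 34
57  0 34 25 53 39 17 44 68 43 39 56
40 34  0 18 57 27 59 16 44 70 73 20
45 25 18  0 27 51  9 12 32 22 20 15
63 53 57 27  0 42 40 26 68 67 63 31
63 39 27 51 42  0 13 18 50 34 30 23
20 17 59  9 40 13  0  8 22 60 64 10
28 44 16 12 26 18  8  0 35 21 18 43
74 68 44 32 68 50 22 35  0 56 50 44
52 43 70 22 67 34 60 21 56  0 78 25
45 39 73 20 63 30 64 18 50 78  0 23
34 56 20 15 31 23 10 43 44 25 23  0
"""
R = [[int(x) for x in line.split()] for line in ROWS.strip().splitlines()]

def grow_cluster(start, threshold=40):
    """One pass of the routine: keep filtering the line of candidates by the next test in it."""
    cluster = [start]
    # Tests correlating at least `threshold` (in hundredths) with the starting test.
    candidates = [t for t in range(1, len(R) + 1)
                  if t != start and R[start - 1][t - 1] >= threshold]
    while candidates:
        nxt = candidates.pop(0)                    # the test next in the line
        cluster.append(nxt)
        # Only tests that also correlate sufficiently with it survive.
        candidates = [t for t in candidates if R[nxt - 1][t - 1] >= threshold]
    return cluster

print(grow_cluster(1))
```

Each stage of the loop corresponds to one line of the triangular set of notes, and the cluster is read off from the first entry of each line.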


Figure 7. 


7. Comparison with the bifactor groups.—If we compare 
these clusters with the grouping we found by Tryon’s 
method of profiles (or peaks), we see that our present clusters 
F, E, and C are those we arrived at formerly (except for the 
absence of Test 9 from cluster E). And we notice also 
that in our diagram these are mutually exclusive clusters. 
The missing Test 9 is the one we formerly had most doubt 
about classifying. The reason can be seen from the analysis
we have already made. It is highly saturated with the
general factor, and only very weakly with the verbal
factor which decides its bifactor group.

8. A less artificial example.—The above example was an
artificial one, made so as to “come out” exactly. Let us
turn to a more realistic example where this is not the case.
The following correlations (decimal points are again
omitted) are from an actual report, but to obviate some
embarrassments in a didactic example I have made all the
coefficients rather larger than they actually were. The
first seven “tests” are examinations in school subjects,
the next four are “non-verbal” tests with simple pieces
of apparatus, and the last three are special tests supposed
to be uncontaminated by any group factor other than g,
v, and k (the “space” factor).


                        1   2   3   4   5   6   7   8   9  10  11  12  13  14
     1 Physics          .  76  82  68  64  40  28  44  19  16  21  45  11  10
     2 Chemistry       76   .  68  62  52  26  26  43  36  29  23  38  15  13
     3 Mathematics     82  68   .  68  47  48  21  37  23  13  20  43  19  18
     4 French          68  62  68   .  45  23  34  29  25 -13  05  26  34  00
     5 Mech. Draw.     64  52  47  45   .  36  17  53  55  38  21  36  07  42
     6 Problems        40  26  48  23  36   .  19  51  47  20  40  47  05  36
     7 Reading         28  26  21  34  17  19   .  09  07  02  17 -07  38  03
     8 Koh's Blocks    44  43  37  29  53  51  09   .  81  50  50  64  43  65
     9 Cube Constr.    19  36  23  25  55  47  07  81   .  42  53  53  37  66
    10 Form Board      16  29  13 -13  38  20  02  50  42   .  52  34  19  38
    11 Passalong       21  23  20  05  21  40  17  50  53  52   .  32  32  46
    12 g test          45  38  43  26  36  47 -07  64  53  34  32   .  40  57
    13 v test          11  15  19  34  07  05  38  43  37  19  32  40   .  45
    14 k test          10  13  18  00  42  36  03  65  66  38  46  57  45   .

When by the above method we sort these tests into clusters,
using 0.40 as boundary line, we obtain the following
diagram:

In passing, we may note that this diagram illustrates
what Raymond Cattell (1946) calls a nuclear cluster, i.e.
one which forms the centre of a number of larger looser
clusters. Here the pair 8 and 9 are never separated,
occur together in clusters B, C, D, and E, and are such a
nuclear cluster. For bifactor analysis, however, we want
non-overlapping clusters.

9. A first attempt at grouping.—Searching in this diagram 
for at least three non-overlapping contours, we find 
clusters A, F, and either C or D. Of the alternatives let 
us take D, and rewrite our table of correlations with these 
clusters separated. This leaves Tests 6 and 7 out of the 
picture, and further study of the diagram leads us also to 
omit 5, which is linked with both F and D through cluster B. 
Our table, and its calculations, then is as follows : 


          1   2   3   4 |   8   9  10  11   Sum |  12  13  14   Sum
     1    .  76  82  68 |  44  19  16  21  1.00 |  45  11  10   .66
     2   76   .  68  62 |  43  36  29  23  1.31 |  38  15  13   .66
     3   82  68   .  68 |  37  23  13  20   .93 |  43  19  18   .80
     4   68  62  68   . |  29  25 -13  05   .46 |  26  34  00   .60
                                           3.70                2.72

          1   2   3   4   Sum |   8   9  10  11 |  12  13  14   Sum
     8   44  43  37  29  1.53 |   .  81  50  50 |  64  43  65  1.72
     9   19  36  23  25  1.03 |  81   .  42  53 |  53  37  66  1.56
    10   16  29  13 -13   .45 |  50  42   .  52 |  34  19  38   .91
    11   21  23  20  05   .69 |  50  53  52   . |  32  32  46  1.10
                         3.70                                  5.29

          1   2   3   4   Sum |   8   9  10  11   Sum |  12  13  14
    12   45  38  43  26  1.52 |  64  53  34  32  1.83 |   .  40  57
    13   11  15  19  34   .79 |  43  37  19  32  1.31 |  40   .  45
    14   10  13  18  00   .41 |  65  66  38  46  2.15 |  57  45   .
                         2.72                    5.29


From this table, by Holzinger's formula, we obtain the
g loadings shown at the right of the next table. For
example:

$$l_{10}^2 = \frac{0.45 \times 0.91}{2.72} = .15055, \qquad l_{10} = .388.$$

When we remove the parts of the correlations due to that
factor, we get the following table of residues. For example:

    .76 - .353 × .404 = .62.

    Residues
            1    2    3    4 |    8    9   10   11 |   12   13   14 | g Loadings
      1     .   62   69   60 |   09  -08   02   02 |   14  -08  -07 |   .353
      2    62    .   53   53 |   03   05   13   02 |   03  -06  -07 |   .404
      3    69   53    .   59 |   00  -06  -02   00 |   10  -01   00 |   .375
      4    60   53   59    . |   07   07  -22  -07 |   06   22  -11 |   .228
      8    09   03   00   07 |    .   05   12  -02 |  -21  -09   17 |   .984
      9   -08   05  -06   07 |   05    .   12   12 |  -14  -04   28 |   .769
     10    02   13  -02  -22 |   12   12    .   32 |   00  -02   19 |   .388
     11    02   02   00  -07 |  -02   12   32    . |  -14   04   20 |   .528
     12    14   03   10   06 |  -21  -14   00  -14 |    .  -06   15 |   .867
     13   -08  -06  -01   22 |  -09  -04  -02   04 |  -06    .   19 |   .529
     14   -07  -07   00  -11 |   17   28   19   20 |   15   19    . |   .488
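
The loadings in the right-hand column and any single residue can be recomputed from the row and block totals. A brief sketch for four of the tests, assuming the totals of the grouped table; the tuple layout and the name `sums` are illustrative.

```python
from math import sqrt

# For each test: (total with one other group, total with the remaining group,
#                 grand total of the rectangle joining those two other groups)
sums = {
    1: (1.00, 0.66, 5.29),
    2: (1.31, 0.66, 5.29),
    10: (0.45, 0.91, 2.72),
    12: (1.52, 1.83, 3.70),
}

loading = {test: sqrt(a * b / block) for test, (a, b, block) in sums.items()}

# Residue for Physics and Chemistry after removing the part due to g.
residue_1_2 = 0.76 - loading[1] * loading[2]

for test, value in loading.items():
    print(test, round(value, 3))
print(round(residue_1_2, 2))
```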


On examining these residues, however, we see that this 
time our hypothesis, that the clusters are exclusive with 
regard to their second group factors, is not justified. True, 
many of the residues in the side squares are very small. 
But two facts strike the eye: Test 14 (the k or space 
factor test) has quite large residues with the middle or non- 
verbal group, and Tests 10 and 11 (Form Board and 
Passalong) have a much larger residue than the other 
tests in the middle square. These facts suggest further 
purging the battery of 14 and either 10 or 11. It is very
frequently necessary to “purge” a battery before the
proper loadings of the remaining tests can be ascertained.

10. The purged battery.—When we do this (the reader
should rewrite the tables and carry out the work), we get
the following loadings and residues:

    Residues
            1    2    3    4 |    8    9   11 |   12   13 | g Loadings
      1     .   57   64   52 |   08  -08   02 |   10  -11 |   .424
      2    57    .   48   45 |   05   07   03 |   00  -09 |   .455
      3    64   48    .   52 |   00  -05   01 |   07  -04 |   .436
      4    52   45   52    . |  -02   02  -11 |  -05   15 |   .368
      8    08   05   00  -02 |    .   28   13 |  -06  -01 |   .842
      9   -08   07  -05   02 |   28    .   25 |   00   04 |   .633
     11    02   03   01  -11 |   13   25    . |  -04   09 |   .437
     12    10   00   07  -05 |  -06   00  -04 |    .   13 |   .835
     13   -11  -09  -04   15 |  -01   04   09 |   13    . |   .522
This table is much more like our artificial model. None 
of the correlation coefficients in the side squares are far 
from zero—we shall learn later how to decide whether they 
are, in fact, small enough to be ignored. Meanwhile, let us 
assume this, and suppose, that is to say, that these three 
groups of tests really are exclusive of one another in their 
second group factors. Their loadings in these we could 
then proceed to calculate. This is easily done in the middle 
group, where there are exactly three tests. We have: 


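$$m_8^2 = \frac{.28 \times .13}{.25} = .1456, \qquad m_8 = .382$$

$$m_9^2 = \frac{.28 \times .25}{.13} = .5384, \qquad m_9 = .734$$

$$m_{11}^2 = \frac{.25 \times .13}{.28} = .1161, \qquad m_{11} = .341$$

The equations of these three tests are therefore:

$$z_8 = .842g + .382h + .383\,s_8$$

$$z_9 = .633g + .734h + .246\,s_9$$

$$z_{11} = .437g + .341h + .832\,s_{11}$$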


where the group factor common to them is given the non- 
committal name h. The coefficients of the specifics are 
settled by the fact that the sum of the squares of the co- 
efficients of such an equation (since the factors are inde- 
pendent) must equal unity. It will be noticed that Test 11 
(Passalong) has here a large specific. It probably shares a 
good deal of this with Test 10 (Form Board) which we 
excluded from the battery meanwhile for this very reason.* 
We cannot similarly calculate the group factor loadings of 
the third group of tests, for there are only two of them and 

* It should be repeated at this point that this example is purely 
illustrative, and no conclusions about actual tests may be drawn 
from this or from any of our examples. This is a book about 
factorial methods, not results. 


three tests are necessary. We only know that the product
of their two group factor loadings is .13. This emphasizes
the necessity, in planning a bifactor battery, to have a
sufficient number of tests. There must be at least three
groups, and at least three tests in each group.
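
The group-factor loadings of the middle group, and the specific loadings that follow from unit variance, can be worked through in a few lines. A sketch, assuming the residues .28, .13 and .25 and the g loadings of the purged battery; the names `group_loading`, `h` and `s` are illustrative, and small differences in the third decimal from the printed equations arise from rounding.

```python
from math import sqrt

g = {8: 0.842, 9: 0.633, 11: 0.437}                 # g loadings, purged battery
res = {(8, 9): 0.28, (8, 11): 0.13, (9, 11): 0.25}  # residues in the middle group

def r(i, j):
    """Residue for tests i and j, in either order."""
    return res[(min(i, j), max(i, j))]

def group_loading(t):
    """With exactly three tests there is one triad for each test."""
    i, j = [x for x in g if x != t]
    return sqrt(r(t, i) * r(t, j) / r(i, j))

h = {t: group_loading(t) for t in g}
# The factors are independent, so the squared coefficients must sum to unity.
s = {t: sqrt(1 - g[t] ** 2 - h[t] ** 2) for t in g}

for t in g:
    print(t, round(g[t], 3), round(h[t], 3), round(s[t], 3))
```

With only two tests in a group the same triad cannot be formed, which is why the text asks for at least three tests in each group.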

The first group has four tests, and our first step should be 
to see whether its tetrad-differences are zero. If they were 
exactly zero, it would be immaterial which three of the 
four tests we chose to calculate loadings from. Here the 
tetrad-differences, though small (.0084, .0384, .0468), are
not exactly zero. We shall defer to the next chapter 
(page 43) the question of how to make the best estimate 
of the loadings under these circumstances, but the reader 
might care to calculate them from every possible three of 
the four tests and average the results. Our illustration has 
served its purpose of bringing to light difficulties which do 


not exist in an artificial example made to avoid raising 
them. 
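
The three tetrad-differences of the first group, and the suggested averaging over every possible three of the four tests, can be sketched as below. The residues are those of the purged battery; the averaging rule is simply the one the text proposes to the reader, not the best estimate deferred to the next chapter, and `averaged_loading` is an illustrative name.

```python
from itertools import combinations
from math import sqrt

res = {(1, 2): .57, (1, 3): .64, (1, 4): .52, (2, 3): .48, (2, 4): .45, (3, 4): .52}
tests = [1, 2, 3, 4]

def r(i, j):
    """Residue for tests i and j, in either order."""
    return res[(min(i, j), max(i, j))]

tetrads = [
    r(1, 2) * r(3, 4) - r(1, 3) * r(2, 4),
    r(1, 3) * r(2, 4) - r(1, 4) * r(2, 3),
    r(1, 2) * r(3, 4) - r(1, 4) * r(2, 3),
]

def averaged_loading(t):
    """Mean of the triad estimates of test t over every pair of the other three tests."""
    others = [x for x in tests if x != t]
    values = [sqrt(r(t, i) * r(t, j) / r(i, j)) for i, j in combinations(others, 2)]
    return sum(values) / len(values)

print([round(d, 4) for d in tetrads])
print({t: round(averaged_loading(t), 3) for t in tests})
```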


CHAPTER III 


SAMPLING ERROR AND THE THEORY OF TWO
FACTORS


1. Sampling error.—The general idea underlying the 
notion of a sampling error is not a difficult one. Take, for 
example, the average height of all living Englishmen who 
are of full age. This could, if need be, be ascertained by the
process of measuring every living Englishman of full age.
Actually this has never been done, and when anyone makes
a statement such as “The average height of Englishmen is
67½ inches,” he is basing it upon a sample only. This
sample may not be an unbiased one. Indeed, samples of 
Englishmen whose height has been officially recorded are 
heavily loaded with certain classes of Englishmen—for 
example, prisoners in gaol, and unemployed young men 
joining the army of preconscription days. The average 
height of such men may well differ from that of all English- 
men. But when we speak of sampling error, we do not 
mean error due to the sample being known to be a biased 
one. Even if the sample of Englishmen used to find the 
average height of their race were, as far as could be seen, a
perfectly fair sample, containing the proper proportion of
all classes of the community and of all adult ages, etc., it
yet would not necessarily yield an average exactly equal
to that of all Englishmen. Several apparent replicas of the
sample would yield different averages. It is these differences,
between statistics gathered from different but
equally good samples, that we mean by sampling errors.

It is worth while calling attention at this point to a 
general fact which will be found of importance at a later 
stage of this book. The true average height of Englishmen
is only so by definition, and does not in principle differ
from the average of a sample. We had to define the population
we had in mind as “all living Englishmen of full
age.” This is a perfectly well-marked body of men. But
it is itself in its turn only a sample: a sample of all living
Europeans, or all living men. It is, indeed, altering daily 
and hourly as men die or reach the age of 21, and each 
generation is a sample of those that have been and may be. 
Those who reach the age of 21 are only some, and therefore 
only a sample, of those born. And even those born are 
only a sample of those who might have been born had 
times been better or had there been no war, or a tax on 
bachelors. So the idea of sampling is a relative one, and 
the “complete population” from which we take samples
is a matter of definition only. The mathematical problem 
in connexion with sampling which it is desirable to solve 
if possible for each statistic is to find the complete law of 
its distribution when it is derived from each of a large 
number of samples of a given size. Mathematically this 
is often very difficult, and frequently we have to be 
content with a formula which gives its approximate 
variance if certain assumptions are allowed and certain 
small quantities are neglected. 

Sampling problems are of two kinds, direct and inverse. 
The easier kind of problem is to say what the distribution 
of a statistic will be in samples of a given size when we 
know all about the true values in the whole population : 
the more difficult kind is to estimate what the true value 
of a statistic is in a complete population when we know 
its observed value in certain samples. They differ as 
do problems of interpolation and extrapolation. As an 
example of the direct kind of problem, let us suppose that 
we actually knew the height of every adult Englishman 
[...]. We could then, on being told a certain sample
[...] such and such a height, calculate
[...] probability that [...]
grow less as the average [...] from the average of the whole
population. [...] also depend on the size of the sample,
for if a [...] deviates far from the true average,
[...] random, more likely to have some [...]
than a small sample with the same average.

2. S[...] errors.—By the distribution of a certain
variable in the population we mean the curve (usually
expressed as an equation) showing its frequency of occur- 
rence for each possible value. Thus the curve in Figure 9 
might show the distribution of height in living adult 
Englishmen, by its height above the base line at each point. 
More men (represented by the line MN) have the average 
height, 67½ inches, than have the height 73 inches, the
frequency of the latter being shown by the line PQ. The 
shaded area represents all men whose height is 73 inches 
or more, and its ratio to the area under the whole curve 
is the probability that an Englishman taken absolutely at 
random will have a height of 73 inches or more. 

Very often distributions are, at any rate approximately,
of a certain shape called the “normal curve.” The normal
curve has a known equation, it is symmetrical about its
mid point, and with the aid of published tables can be
drawn accurately (or reproduced arithmetically) if we know the
mid point M (which is the average of the measurements) and a
certain distance ST or S'T (which is equal to
the standard deviation of the measurements).
S and S' are the points where the curve changes from
being convex to being concave.

[Figure 9. Distribution of height; base line marked in inches from 61 to 76, with the points S, S', M, N, P, and Q.]

If the distribution of a variable, say the heights of adult
Englishmen, is “normal,” then the distribution of the
means of samples of p Englishmen’s heights will also be
normal, but will be more closely concentrated about the
point M than are the measurements of individuals: in
point of fact, its variance will be p times smaller, its
standard deviation thus √p times smaller. That is to
say, if we take sample after sample of 25 Englishmen
each time, and for each sample record the average height,
the means thus accumulated will be distributed in a curve
of the same shape as that of Figure 9, but narrower from
side to side, so that SS' would be one-fifth (1/√25) of what
it is in Figure 9, which is the distribution of single
measurements.
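
The narrowing of the curve for means of 25 can be seen in a small simulation. This is only a sketch: the standard deviation of 2.5 inches is a hypothetical value chosen for illustration, heights are drawn from a normal distribution, and the seed is fixed so that the run is repeatable.

```python
import random
from statistics import mean, pstdev

random.seed(1)                      # fixed seed so the run is repeatable
MEAN, SD, p = 67.5, 2.5, 25         # SD of 2.5 inches is a hypothetical value

# Single measurements, and the means of many samples of p men each.
individuals = [random.gauss(MEAN, SD) for _ in range(20000)]
sample_means = [mean(random.gauss(MEAN, SD) for _ in range(p)) for _ in range(4000)]

# Ratio of the spread of the means to the spread of individuals.
ratio = pstdev(sample_means) / pstdev(individuals)
print(round(ratio, 3))
```

The ratio should come out close to one-fifth, the value 1/√25 given in the text, though any particular run will differ slightly from it.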

If a sample were made with some special end in view, 
such as ascertaining whether red-headed men tend to be 
tall, we would decide whether we had detected such a 
tendency by calculating the probability that a mean such 
as our red-headed sample showed, or a mean still farther 
away from M, would occur at random. For this purpose 
we would compare the deviation of our sample from M 
with the standard deviation of the distribution of such 
samples, obtained by dividing the standard deviation of 
individuals by the square root of p, the number in the 
sample. The ratio of the deviation found, to the standard 
deviation, is the criterion, and the larger it is the more 
likely is it that red-headed men really do tend to be tall. 
For many practical purposes we take a deviation of over 
twice the standard deviation as “ significant." 

Sometimes the reader will find significance questions 
discussed in terms of the “ probable error ” instead of the 
standard deviation. The probable error is best considered 
as a conventional reduction of the standard deviation (or 
standard error, as it is sometimes called) to two-thirds of 
its value (more exactly, to -67449 of its value). 
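
The criterion ratio and the factor for the probable error can both be written down briefly. In this sketch the function name `criterion` and the numbers in the example (a sample mean of 68.5 inches, a standard deviation of 2.5 inches) are hypothetical; the factor is obtained as the upper quartile of the standard normal curve, which is the usual definition of the probable error.

```python
from math import sqrt
from statistics import NormalDist

def criterion(sample_mean, population_mean, sd_individuals, p):
    """Deviation of the sample mean divided by the standard deviation of such means."""
    return (sample_mean - population_mean) / (sd_individuals / sqrt(p))

# A sample of 25 men averaging one inch above the population mean.
ratio = criterion(68.5, 67.5, 2.5, 25)

# Half of all deviations lie within this multiple of the standard deviation.
probable_error_factor = NormalDist().inv_cdf(0.75)
print(ratio, round(probable_error_factor, 5))
```

A deviation of twice the standard deviation therefore corresponds to nearly three times the probable error.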

Not only would the average height, or the average weight, 
of the sample of red-headed men differ from sample to 
sample. Statistics calculated in more complex ways from 
the measurements will also vary from sample to sample, 
as, for example, the variance of height, or the variance of 
weight, or the correlation of height and weight. Let us 
consider first the variance of the heights. In the whole
population this is calculated by finding the mean, expressing
every height as a plus or minus deviation from the
mean, squaring all these deviations, and dividing the sum
by the number in the population.

This is also how we would find the variance of the sample
if we really want the variance of the sample. But if we
want an estimate of the variance in the whole population,
and the sample is small, it is better to divide by one less
than the number in the sample. A glimpse of the reason
for this can be got by considering the case of the smallest
possible sample, namely, one man. Here the mean of the
sample is the one height that we have measured, and the 
deviation of that measurement from the mean of the sample 
is zero. The formula if we divide by the number in the 
sample (one) will give zero for the variance—and that is 
correct for the sample. But it would be too bold to estimate
the variance of the whole population from one measurement:
if we divide by one less than the sample we get variance
= 0/0, that is, we don't know, which is a wiser statement.*

More generally we can begin to understand the reason 
for dividing by (p — 1) instead of by p by the following 
considerations. 

The quantity we want to estimate is the mean square 
deviation of the measurements of the whole population, 
the deviations being taken from the mean of that whole 
population. We do not, however, know that true mean, 
and therefore in a sample we are reduced to using the mean 
of the sample, which except by a miracle will not exactly 
coincide with the true or population mean. The conse- 
quence is that the sum of the squares we obtain is smaller 
than it would have been had we known and used the true 
mean. For it is a property of a mean that the sum of the
squares of deviations from it is smaller than of deviations 
from any other point. 


* It is important to remember that sampling the population is not
the only source of error in the measurement of statistics, e.g. the
correlation coefficient. All sorts of influences may disturb it. These
will usually “attenuate” the correlation coefficient, i.e. tend to
bring it nearer to zero, as can be seen when we consider that a perfect
correlation only can be reduced by error. But they will not always
do so, and if the errors in the two trait measurements are themselves
correlated, they may even increase the true correlations in a majority
of cases. An estimate of the amount of variable error present can
be made from the correlation of two measurements of the same
trait on the same group, a correlation called the “reliability,” which
should be perfect if no variable errors are present. Spearman's
correction for attenuation (see Brown and Thomson, 1925, 156) is based
upon this. Like all estimates, the correction for attenuation is correct,
even if the errors are uncorrelated, only on the average and not in
each instance, and it should never be used unless it is small. If it
is large, the experiments are “unreliable” and should be improved.

Consider for example the numbers 2, 3, and 7. Their
mean is 4, and the sum of the squares about 4 is—

$$(-2)^2 + (-1)^2 + 3^2 = 14$$

About any other point this sum will be greater than 14.
About 5, for example, the sum is—

$$(-3)^2 + (-2)^2 + 2^2 = 17$$

About 2 the sum is—

$$0^2 + 1^2 + 5^2 = 26$$


It follows that the sum of the squares we obtained by 
using the sample mean was as small as possible, and in the 
immense majority of cases smaller than the sum about the 
true mean. It is to compensate for this that we divide 
by (p — 1) instead of by p. 
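
The arithmetic of this example, and the compensating effect of the divisor (p — 1), can be checked with a short sketch. It treats the numbers 2, 3, and 7 as if they were a whole (toy) population and draws every ordered sample of two with replacement; that sampling scheme and the function names are assumptions made only for illustration, not part of the text's argument.

```python
from fractions import Fraction
from itertools import product


def sum_sq_about(values, point):
    """Sum of squared deviations of the values from a chosen point."""
    return sum((Fraction(v) - point) ** 2 for v in values)


def variance(values, divisor):
    """Sum of squares about the values' own mean, divided by `divisor`."""
    mean = Fraction(sum(values), len(values))
    return sum_sq_about(values, mean) / divisor


numbers = [2, 3, 7]
about_mean = sum_sq_about(numbers, 4)   # about the mean
about_five = sum_sq_about(numbers, 5)   # about another point
about_two = sum_sq_about(numbers, 2)    # about yet another point

# Toy population: its variance uses the divisor N = 3.
population_variance = variance(numbers, len(numbers))

# Every ordered sample of p = 2 drawn with replacement (illustrative assumption).
samples = list(product(numbers, repeat=2))
mean_div_p = sum(variance(s, 2) for s in samples) / len(samples)
mean_div_p_minus_1 = sum(variance(s, 1) for s in samples) / len(samples)

print(about_mean, about_five, about_two)
print(population_variance, mean_div_p, mean_div_p_minus_1)
```

Under these assumptions the average of the sample variances computed with the divisor (p — 1) equals the population variance, while the average computed with the divisor p falls short of it.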

These elementary considerations do not of course indi- 
cate just why this procedure should, in the long run, ex- 
actly compensate for using the sample mean. Why not 
(p — 2), one might say, or (p — 3)? It is not possible, in 
an elementary account like the present, to answer this. 
Geometrical considerations, however, throw some further 
light on the problem. The p measurements of the sample 
may be thought of as existing in a certain space of (p — 1) 
dimensions. For example, two points define a line (of one 
dimension), three points define a plane (of two dimensions), 
and so on. The true mean of the whole population is not 
likely to be within that space, whereas the mean of the 
sample is. The deviations we have actually squared and 
summed are therefore in a space of one dimension less than 
the space containing the true mean. One “degree of freedom”
has been lost by the fact that we have forced the
lines we are squaring to exist in a space of (p — 1) dimensions
instead of permitting them to project into a
p-space. Hence the division by (p — 1) instead of p.
This principle goes farther. For each statistic which we
calculate from the sample itself and use in our subsequent
calculations, we lose a “degree of freedom.”


The standard error of a variance v, if the parent population
from which the samples are drawn is normally distributed,
is estimated as—

$$\frac{v\sqrt{2}}{\sqrt{p - 1}}$$

where p is the number of persons in the sample. The
standard error of a correlation coefficient r is, with the
same condition, estimated as—

$$\frac{1 - r^2}{\sqrt{p - 1}}$$

The use of this standard error, however, should be dis- 
continued (unless the sample is large and r small). 

Fisher (1925, page 202) has pointed out that the use of the 
formula for the standard error of a correlation coefficient 
is valid only when the number in the sample is large and 
when the true value of the correlation does not approach 
±1. For in small samples the distribution of r is not
normal, and even in large samples it is far from normal 
for high correlations. The distribution of r for samples 
from a population where the correlation is zero differs 
markedly from that where the correlation is, say, 0:8. 
This means that the use of a standard error for testing 
the significance of correlation coefficients should, except 
under the above conditions, be discouraged. 

To get over the difficulty Fisher transforms r into a new
variable z given by—

$$z = \tfrac{1}{2}\{\log_e(1 + r) - \log_e(1 - r)\} = r + \tfrac{1}{3}r^3 + \tfrac{1}{5}r^5 + \dots$$

It is not, however, necessary to use this formula, as complete
tables have been published for converting values
of r into the corresponding values of z. As r goes from −1
to +1, z goes from −∞ to +∞, and r = 0 corresponds
to z = 0.

The great advantage of using z as a variable instead of r
is that the form of the distribution of z depends very little
upon the value of the correlation in the population from
which samples are drawn. Though not strictly normal, it
tends to normality rapidly as the size of the sample is
increased, and even for small samples the assumption
of normality is adequate for all practical purposes. The
standard deviation of z may in all cases be taken to be
1/√(p − 3), where p is the number of persons in the sample.
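
Where published tables are not to hand, the transformation is easily computed. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the series is truncated at an arbitrary number of terms.

```python
import math


def fisher_z(r):
    """Fisher's transformation z = (1/2){log(1 + r) - log(1 - r)}."""
    return 0.5 * (math.log(1 + r) - math.log(1 - r))


def z_series(r, terms=50):
    """Partial sum of the series r + r^3/3 + r^5/5 + ..."""
    return sum(r ** (2 * k + 1) / (2 * k + 1) for k in range(terms))


def z_standard_deviation(p):
    """Approximate standard deviation of z for a sample of p persons."""
    return 1 / math.sqrt(p - 3)


for r in (0.0, 0.42, 0.56, 0.8):
    print(r, round(fisher_z(r), 4), round(z_series(r), 4))
print(round(z_standard_deviation(50), 4))
```

The closed form and the series agree closely for moderate r; the series converges slowly as r approaches ±1, which is one reason to prefer the logarithmic form.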

3. Error of a single tetrad-difference.—For our discussion
of the influence of sampling on the factorial analysis of
tests one of the most important quantities to know is the
standard error of the tetrad-difference. There has been
much debate concerning the proper formula for this. (See
Spearman and Holzinger, 1924, 1925, 1929; Pearson and
Moul, 1927; Wishart, 1928; Pearson, Jeffery, and Elderton,
1929; Spearman, 1931.) That generally employed is
formula (16) in the Appendix to Spearman’s The Abilities
of Man:

Standard error of $r_{13}r_{24} - r_{23}r_{14}$ =

$$\frac{2}{\sqrt{N}}\left[r^2(1 - r)^2 + (1 - 2r^2)s^2\right]^{1/2}$$ [Spearman and Holzinger's formula (16).]

where N is the number of persons in the sample,*
r is the mean of the four correlation coefficients, and
s² is their mean squared deviation (variance) from r.

The probable error is .6745 times the above. A worked
example will be found on page xii of Spearman's Appendix,
using (which is all one can do) the observed values of the r’s.

It will be remembered that in Section 7 of Chapter I
we stated Spearman's discovery in the form “tetrad-differences
tend to be zero.” If tetrad-differences in the
whole population, however, were all actually zero, they
would not remain exactly zero in samples, and it is only
samples that are available to us. We are faced, therefore,
with a two-fold problem. (a) We have to decide, from the
size of the tetrad-differences, whether the sample is
compatible [...] tetrad-differences are not zero in the whole population,
leaving a verdict of “not proven.” (See Emmett, 1936.)

* We use p to mean the number of persons in this book, but are
using N here and in “formula 16A” below to preserve the usual
appearance of these well-known and much-used expressions.

4. Distribution of a group of tetrad-differences.—The 
actual calculation, for every separate tetrad-difference, of 
its standard error by Spearman and Holzinger’s formula 
(16) is, however, an almost impossibly laborious task. In 
a table of correlations formed from n tests there are
n(n — 1)/2 correlation coefficients, and n(n — 1)(n — 2)
(n — 3)/8 different (though not independent) tetrad-
differences. Any one particular correlation-coefficient is 
concerned in (n — 2)(n — 3) different tetrad-differences, 
and any one test in (n — 1)(n — 2)(n — 3)/2 different 
tetrad-differences. Thus with ten tests there are 630 
tetrad-differences, and with twenty tests 14,535 tetrad- 
differences. In the latter case, any one test is concerned 
in 2,907. Under these circumstances, it is natural to look 
for a more wholesale method than that of calculating the 
standard error of each tetrad-difference. The method 
adopted by Spearman is to form a table of the distribution 
of the tetrad-differences, and compare this distribution 
with that of a normal curve centred at zero and with 
standard deviation given by— 

$$\frac{2}{\sqrt{N}}\left[r^2(1 - r)^2 + (1 - R)s^2\right]^{1/2}$$ [Spearman and Holzinger's formula (16A).]

where N = number of persons in the sample,
r = the mean of all the r’s in the whole table,
s² = their mean squared deviation from r,
$R = 3r\,\frac{n - 4}{n - 2} - 2r^2\,\frac{n - 6}{n - 2}$, and
n = number of tests.

Numerous examples of the comparison of “histograms”
of tetrad-differences with normal curves whose standard
deviation is found by (16A) are given in Spearman’s The
Abilities of Man. This method of establishing the hypothesis,
that the tetrad-differences are derived by sampling
from a population in which they are really zero, is open to
the same doubt as was explained in the simpler case of
one tetrad-difference. The comparison can prove that
the tetrad-differences observed are compatible with that
hypothesis. It does not in itself prove that they are
compatible with that hypothesis only; and, as Emmett
has shown in the article already mentioned, the odds are
commonly rather against this.
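
The counts of tetrad-differences quoted earlier in this section can be confirmed by direct enumeration. The sketch below assumes, as the counting formula implies, that every set of four tests yields three distinct tetrad-differences when sign is ignored; the function names are illustrative only.

```python
from itertools import combinations


def count_tetrads_by_enumeration(n):
    """List every set of four tests; each set gives three tetrad-differences."""
    return 3 * sum(1 for _ in combinations(range(n), 4))


def correlations(n):
    """Number of correlation coefficients among n tests."""
    return n * (n - 1) // 2


def tetrads_total(n):
    """Number of different tetrad-differences among n tests."""
    return n * (n - 1) * (n - 2) * (n - 3) // 8


def tetrads_per_correlation(n):
    """Tetrad-differences in which one particular correlation is concerned."""
    return (n - 2) * (n - 3)


def tetrads_per_test(n):
    """Tetrad-differences in which one particular test is concerned."""
    return (n - 1) * (n - 2) * (n - 3) // 2


for n in (10, 20):
    print(n, correlations(n), tetrads_total(n), tetrads_per_test(n))
```

The enumeration and the closed formula agree, which also shows how quickly the labour of computing a separate standard error for each tetrad-difference grows with the number of tests.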

The usual practice, moreover, is to “purify” the battery
of tests until the actual distribution of tetrad-differences
agrees with (16A), so that in effect all that is then proved
is that a team can be arrived at which can be described in
terms of two factors. This, although a more modest
claim than has often been made, and certainly less than 
is implicitly understood by the average reader, is never- 
theless a matter of some importance. Not all teams of 
tests can be explained by one common factor; but it is 
not very difficult to find teams which can. There is little 
doubt in the minds of most workers that a tendency towards 
hierarchical order actually exists among mental tests. 

5. Spearman’s saturation formula.—It will be remembered
from Section 4 of Chapter I that the calculation of
the g saturation of each test forms an important part of
the Spearman process. We saw there that in a hierarchical
matrix each correlation is the product of the two g saturations
of the tests, for example—

$$r_{12} = r_{1g}\,r_{2g}$$

Since this is so, each g saturation can be calculated
from the correlations of a test with two others, and their
inter-correlation. Thus to find $r_{1g}$ we can take Tests 2 and
3 as reference tests, when we have—

$$\frac{r_{12}\,r_{13}}{r_{23}} = \frac{r_{1g}r_{2g} \cdot r_{1g}r_{3g}}{r_{2g}r_{3g}} = r_{1g}^2$$

When the matrix is really hierarchical, and there are
no sampling errors present, it is immaterial which two tests
we associate with Test 1 in order to find its g saturation.
We have, in fact, in that case—

$$r_{1g}^2 = \frac{r_{12}\,r_{13}}{r_{23}} = \frac{r_{12}\,r_{14}}{r_{24}} = \frac{r_{12}\,r_{15}}{r_{25}} = \text{etc.}$$

But even if the correlations, measured in the whole
population, were really exactly hierarchical, sampling
errors would make these fractions differ somewhat from
one another, and we are faced with the problem of deciding
which value to accept for the g saturation. The average
of all possible fractions like the above would be one very
plausible quantity to take but is laborious to compute.
Spearman therefore adopts a fraction—

$$\frac{r_{12}r_{13} + r_{12}r_{14} + r_{12}r_{15} + r_{13}r_{14} + \text{etc.}}{r_{23} + r_{24} + r_{25} + r_{34} + \text{etc.}}$$

whose numerator is the sum of the numerators, and whose
denominator is the sum of the denominators, of the single
fractions. This combined fraction he computes in a
tabular manner which we will next describe, by the
algebraically equivalent formula—

$$r_{1g}^2 = \frac{A_1^2 - A_1'}{T - 2A_1}$$ [Spearman's formula (21), Appendix, Abilities of Man.]

The quantities $A_1$, $A_2$, etc., are the sums of the rows (or
columns) of the matrix of correlations without any entries
in the diagonal cells. (The arithmetical example is confined
to five tests to economize space):


         1      2      3      4      5   |    A       A²
  1      .     .50    .34    .33    .24  |  1.41    1.988
  2     .50     .     .56    .32    .15  |  1.53    2.341
  3     .34    .56     .     .13    .35  |  1.38    1.904
  4     .33    .32    .13     .     .29  |  1.07    1.145
  5     .24    .15    .35    .29     .   |  1.03    1.061
                                         T = 6.42


T is the sum of all the A’s, and therefore of all the
correlations in the table (where each occurs twice). A
new table is now written out, with each coefficient squared,
and its rows summed to obtain the quantities A′:


         1       2       3       4       5    |   A′
  1      .     .250    .116    .109    .058   |  .533
  2    .250      .     .314    .102    .023   |  .689
  3    .116    .314      .     .017    .123   |  .570
  4    .109    .102    .017      .     .084   |  .312
  5    .058    .023    .123    .084      .    |  .288

The calculation of all the saturations is then best performed
in a tabular manner, thus:

  Test |   A²      A′    A² − A′    2A     T − 2A   (A² − A′)/(T − 2A)   Saturation
   1   |  1.988   .533    1.455    2.82     3.60          .4042             .66
   2   |  2.341   .689    1.652    3.06     3.36          .4917             .70
   3   |  1.904   .570    1.334    2.76     3.66          .3645             .60
   4   |  1.145   .312     .833    2.14     4.28          .1946             .44
   5   |  1.061   .288     .773    2.06     4.36          .1773             .42


where the last column is the square root of the preceding.
The reader should calculate the six different values of
$r_{1g}$ from the original table by the formula $\sqrt{r_{1i}\,r_{1j}/r_{ij}}$
for comparison with the value .66 obtained above. He
will find—

    .55    .72    .89
           .93    .48
                  .52

with an average of .68.
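
The tabular calculation can be retraced mechanically. The sketch below takes the ten correlations of the five-test example and evaluates formula (21) for each test, together with the six single values for Test 1; the data structure and function names are illustrative assumptions, and unrounded squares are used, so the intermediate figures may differ from the table in the last place.

```python
import math
from itertools import combinations

# Correlations of the five-test example (upper triangle of the table).
R = {
    (1, 2): .50, (1, 3): .34, (1, 4): .33, (1, 5): .24,
    (2, 3): .56, (2, 4): .32, (2, 5): .15,
    (3, 4): .13, (3, 5): .35, (4, 5): .29,
}
TESTS = [1, 2, 3, 4, 5]


def r(i, j):
    """Correlation between tests i and j, in either order."""
    return R[(min(i, j), max(i, j))]


def saturation_squared(k):
    """Formula (21): (A_k^2 - A_k') / (T - 2 A_k)."""
    others = [t for t in TESTS if t != k]
    A = sum(r(k, t) for t in others)            # row sum
    A_prime = sum(r(k, t) ** 2 for t in others)  # row sum of squares
    T = 2 * sum(R.values())                      # each correlation occurs twice
    return (A * A - A_prime) / (T - 2 * A)


def single_values(k):
    """sqrt(r_ki r_kj / r_ij) for every pair of reference tests i, j."""
    others = [t for t in TESTS if t != k]
    return [math.sqrt(r(k, i) * r(k, j) / r(i, j))
            for i, j in combinations(others, 2)]


saturations = {k: math.sqrt(saturation_squared(k)) for k in TESTS}
six = single_values(1)
print({k: round(v, 3) for k, v in saturations.items()})
print([round(v, 2) for v in six], round(sum(six) / len(six), 2))
```

Evaluated on the printed correlations, the sketch reproduces the saturations .70, .60, .44, and .42 for Tests 2 to 5 and the six single values with their average of .68. For Test 1 the square root of .4042 comes to about .64 rather than the .66 printed in the table.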


6. Residues.—If the correlations which would arise from 
these saturations or loadings are calculated, and subtracted 
from the observed correlations, we obtain the residues 
which have then to be examined to see if they are small 
enough to be attributable to sampling error. In the 
following double table of correlations are set out the ob- 
served correlations uppermost, and those calculated from 


the g saturations below. The difference is the residue, 
which may be plus or minus : 


  g Loadings |   .66     .70     .60     .44     .42
     .66     |    .      .50     .34     .33     .24
             |           .46     .40     .29     .28
     .70     |   .50      .      .56     .32     .15
             |   .46             .42     .31     .29
     .60     |   .34     .56      .      .13     .35
             |   .40     .42             .26     .25
     .44     |   .33     .32     .13      .      .29
             |   .29     .31     .26             .18
     .42     |   .24     .15     .35     .29      .
             |   .28     .29     .25     .18

The lower numbers are the products of the two
saturations. In this case the residues range from −.14
to +.14 and at first sight appear in many cases to be
too large to be neglected in comparison with the original
correlations.

To check this impression, consider the correlation .56
and the value .42 from which it is supposed to depart only
by sampling error, a deviation of .14. Fisher's z corresponding
to r = .42 is .45, and that corresponding to r =
.56 is z = .63, so that the z deviation is .18. The standard
deviation of z for 50 cases is 1 ÷ √47 = .15. The deviation
is little larger than one standard deviation and cannot
therefore be called significant. But as the reader will ob- 
serve, this conclusion is due more to the large size of the 
standard error than to the small size of the residue. The 
residue is here attributable to sampling error, because the 
latter is so large. But because the latter is large it does not 
follow that the large residue is certainly due to it. 
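
The check just described may be sketched as follows, using the printed loadings and the sample size of 50 quoted above. The function name is illustrative; the sketch works with unrounded values of z, so the deviation comes out slightly above the .18 obtained from two-figure tables.

```python
import math


def residue_in_sd_units(observed, expected, p):
    """Express the residue as a z deviation and in units of the S.D. of z."""
    z_dev = math.atanh(observed) - math.atanh(expected)  # Fisher's z of each
    sd = 1 / math.sqrt(p - 3)
    return z_dev, sd, z_dev / sd


loadings = [.66, .70, .60, .44, .42]       # printed g saturations
observed_23 = .56                          # observed correlation of Tests 2 and 3
expected_23 = loadings[1] * loadings[2]    # product of their saturations

z_dev, sd, ratio = residue_in_sd_units(observed_23, expected_23, p=50)
print(round(expected_23, 2), round(z_dev, 3), round(sd, 3), round(ratio, 2))
```

The ratio lies between one and two standard deviations, in agreement with the conclusion that this residue cannot be called significant in a sample of this size.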

7. Reference values for detecting specific correlation.—If
after a calculation like that described, one of the residues 
is found to be too large to be explicable by sampling error, 
the excess of correlation over that due to g is attributed to 
“ specific correlation," meaning correlation due to a part 
of their specific factors being not really unique but shared 
by these two tests. In the case of our numerical example, 
if the number of subjects tested had been larger, the standard 
errors of the coefficients would have been smaller, and some 
of the discrepancies between the experimental values and 
those calculated from the g saturations would have been 
too large to be overlooked, but would have had to be 
attributed to specific correlation. In such a case, the g 
loadings would, of course, be wrong and would have to be 
recalculated from the battery after one of the tests concerned
in the specific correlation was removed from it.
Later, the other test could be replaced in the battery 
instead of the first, and thus its g saturation found. The 
difference between the experimental correlation of the 
two, and the product of their g saturations, with a standard 
error dependent on the size of the sample, would be then 
attributed to their specific linkage. 
If two tests, v and w, are thus suspected of having a
specific link as well as that due to g, it is clear that the
smallest battery of tests which could be used in the above
manner to detect that link would be one of two other tests,
x and y, say, to make up a tetrad:

              v              x
    w     $r_{vw}$       $r_{xw}$
    y     $r_{vy}$       $r_{xy}$


and these two “ reference " tests would have to be known 
to have no specific links with each other or with the two 
suspected tests. The example which gave rise to Figure 5 
(see Chapter I, page 15) illustrates this. Tests 2 and 3 
there are, let us suppose, those with a suspected specific 
link. The tetrad-difference to be examined by means of 
Spearman’s formula (16) is that which has $r_{23}$ as one corner.
In such a case, where the two reference tests 1 and 4 are
known to have no link except g with one another, or with
the other two tests, two of the possible tetrad-differences
ought to be larger than three times the standard error
given by formula (16), and equal to one another, while the
third tetrad-difference should be zero (or sufficiently near
to zero, in practice) (Kelley, 1928, 67).

The g saturation of each of the tests under examination
for specific correlation can be found by grouping it with
the two reference tests. Thus in the case of our Figure 5,
we have—

$$r_{2g}^2 = \frac{r_{12}\,r_{24}}{r_{14}} = \frac{.5 \times .5}{.5} = .5$$

$$r_{3g}^2 = \frac{r_{13}\,r_{34}}{r_{14}} = \frac{.5 \times .5}{.5} = .5$$

Therefore the correlation between 2 and 3 which is due
to g is—

$$r_{2g}\,r_{3g} = \sqrt{.5} \times \sqrt{.5} = .5$$

and the difference between this and the observed $r_{23}$
is the part to be explained by the specific link between
these two tests.
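
The procedure with two reference tests can be put as a short sketch. The correlations with the reference tests are set to .5 as in the calculation above, but the observed correlation between the two suspected tests (.8 here) is a purely hypothetical figure chosen for illustration, as are the function names.

```python
import math


def g_part_and_excess(r_vw, r_vx, r_vy, r_wx, r_wy, r_xy):
    """Split r_vw into the part due to g and the excess over it.

    v, w are the suspected tests; x, y are the reference tests.
    """
    v_sat_sq = r_vx * r_vy / r_xy   # squared g saturation of v
    w_sat_sq = r_wx * r_wy / r_xy   # squared g saturation of w
    g_part = math.sqrt(v_sat_sq * w_sat_sq)
    return g_part, r_vw - g_part


def tetrad_differences(r_vw, r_vx, r_vy, r_wx, r_wy, r_xy):
    """The three tetrad-differences of the four tests v, w, x, y."""
    return (r_vw * r_xy - r_vx * r_wy,
            r_vw * r_xy - r_vy * r_wx,
            r_vx * r_wy - r_vy * r_wx)


# Hypothetical observed correlation between the suspected tests: .8
values = dict(r_vw=.8, r_vx=.5, r_vy=.5, r_wx=.5, r_wy=.5, r_xy=.5)
g_part, excess = g_part_and_excess(**values)
t1, t2, t3 = tetrad_differences(**values)
print(round(g_part, 3), round(excess, 3))
print(round(t1, 3), round(t2, 3), round(t3, 3))
```

With these figures the two tetrad-differences that have the suspected correlation as a corner are equal and different from zero, while the third vanishes, which is the pattern described in the text.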
When there are several reference tests available, all
believed to have no link except g with one another or with
the two tests suspected of specific overlap, there will be
a number of ways of picking two of them to obtain the
tetrad required to decide the matter, and the results will,
because of sampling and other errors, be discrepant. Under
these circumstances Spearman has devised an interesting
procedure for amalgamating the results into one. A
numerical example is given by him on page xxii of the
Appendix to The Abilities of Man.


CHAPTER IV 
THE DEFINITION OF g 


1. Any three tests define a “ g.”—The idea of g arose out of 
Professor Spearman’s acute observation that correlation 
coefficients between tests tend to show hierarchical order : 
that is, that their tetrad-differences tend to be zero or small; 
or in more technical terms still, that the rank to which a 
matrix of correlation coefficients can be “ reduced ” by 
suitable diagonal elements tends towards rank one. This 
fundamental fact is at the basis of all those methods of 
factorial analysis which magnify specific factors. In con- 
sequence, correlation coefficients between a number of vari- 
ables can be adequately accounted for by a few common 
factors. To be adequately described by one only—a g— 
the “ reduced ” rank of the correlation matrix has to be 
one, within the limits of sampling error. 

Suppose now that we have three tests and have, in the 
whole population, measured their correlation coefficients, 
If, as is usually the case, these coefficients are all positive, 
and if each of them is at least as large as the product of the 
other two, we can explain them by assuming one g and 
three specifics $s_1$, $s_2$, and $s_3$. There are many other ways
of explaining them, but let us adopt this one. We have 
thereby defined a factor g mathematically (Thomson, 1935a, 
260). It is then for the psychologist to say, from a 
consideration of the three tests which define it, what name 
this factor shall bear and what its [...]
is. The psychologist may think, after studying the tests,
[...] that at any rate he does
not reject the possibility, but that he would like an opportunity
of studying other tests which (mathematically
speaking) contain this factor, and have nothing else in
common, before finally deciding.


In that case the experimenter must search for a fourth
test which, when added to these three, gives tetrad- 
differences which are zero ; and then for a fifth and further 
tests, each of which makes zero tetrad-differences with the 
tests of the pre-existing battery. This extended battery 
the experimenter would lay before the psychological judge, 
to obtain a ruling whether the single common factor, of 
which it is the now extended but otherwise unaltered 
definition, is worthy of being named as a psychological 
factor. 
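
The way in which three tests define a g can be sketched in a few lines. The function below applies the condition stated in this section (all three coefficients positive, and each at least as large as the product of the other two) and, when it holds, returns the three g saturations; the function name and the trial values are illustrative assumptions.

```python
import math


def g_from_three(r12, r13, r23):
    """Return the g saturations of three tests, or None when the
    three correlations cannot be explained by one g and three specifics."""
    if min(r12, r13, r23) <= 0:
        return None
    if r12 < r13 * r23 or r13 < r12 * r23 or r23 < r12 * r13:
        return None  # one saturation would exceed unity
    l1 = math.sqrt(r12 * r13 / r23)
    l2 = math.sqrt(r12 * r23 / r13)
    l3 = math.sqrt(r13 * r23 / r12)
    return l1, l2, l3


loadings = g_from_three(.50, .34, .56)
print([round(l, 3) for l in loadings])
print(g_from_three(.945, .840, .720))   # fails the condition
```

The products of the returned saturations, taken in pairs, give back the three correlations exactly, which is all that is meant here by saying that the factor has been defined mathematically.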

2. The extended or purified hierarchical battery.—Mathe- 
matically, any three tests with which the experimenter 
cared to begin would define “ a ” g, if we except temporarily 
the case, to which we shall later return, of three correlation 
coefficients, one of which is less than the product of the 
other two. The experimental tester, however, might in 
some cases have great difficulty in finding further tests, to 
add to the original three, which would give zero tetrad- 
differences. Unless he could do so, it is unlikely that the 
psychological judge would accept the factor as worthy of 
a name and separate existence in his thoughts. It is, for
example, an experimental fact that starting with three
tests which a general consensus of psychological opinion
would admit to have only “intelligence” as a common
requirement, it has proved possible to extend the battery 
to comprise about a score of tests without giving any 
tetrad-differences which cannot be regarded as zero.* 
Even that has not been accomplished without difficulty,
and without certain blemishes in the hierarchy having to be
removed by mathematical treatment. But the fact that
[...] possible, and that psychological
judgment endorses the opinion that each test of this battery
requires “intelligence,” is the main evidence behind the
actual “existence” of such a factor as “general intelligence.”
It must be noted that the word “existence”
here does not mean that any physical entity exists which
can be identified with this g. It does mean, however, that,
as far as the experimental evidence goes, there is some
aspect of the causal background which acts “as if” it
were a single unitary factor in these tests.

* The process of making such a battery of tests to define general
intelligence (see Brown and Stephenson, 1933) has not taken
the form of choosing three tests as the basal [...] and
extending the battery. Instead, a number of tests which, it was
thought from previous experience, would act in the [...] have
been taken, and the battery thus formed has been purified by
the removal of any tests which broke the hierarchy.

The important point to note is that the experimenter has 
produced a battery of tests which is, he claims, hierarchical; 
that the mathematician assures him that such a battery 
acts “ as if ” it had only one factor in common (though it 
can also be explained in many other ways), and that the 
psychologist agrees that psychologically the existence of 
such a factor as the sole link in this battery seems a reason- 
able hypothesis. 

3. Different hierarchies with two tests in common.—Now, 
it must be remembered that, starting with three other 
tests, which may contain two of the former set, it may 
very well be possible to build up a different hierarchy. 
Only experiment could show whether this were possible in 
each case, there is no mathematical difficulty in the way. 
Such a hierarchy would also define “a” g, but this would 
be usually a different factor from the former g. If there 
were three tests common to the two hierarchies, then the 
two g’s could be identified with one another (sampling 
errors apart), and the three tests would be found to have 
the same saturations with the one g as with the other. But
if only two tests were common to the two batteries this
would not in general be the case, and the different saturations
of these tests with the two g's would show that the
latter were different (Thomson, 1935a, 261-2). Under 
such circumstances the psychologist has to choose. He 
cannot have both these g’s. Both are mathematically of 
equal standing, it is a psychological decision which has to 
be made. When one g is accepted, the other, as a factor, 
must then be rejected and a more complicated factorial 
analysis of the second hierarchy has to be built up which 
is consistent with this. 

4. A test measuring “ pure g.”—Although the hierarchical 
battery defines a g it does not enable it to be measured 
exactly (but only to be estimated) unless either it contains 
an infinite number of tests, or a test can be found which
conforms to the hierarchy and has a g saturation of unity.*
In the latter case this test which is “pure g” is such that
when it is considered along with any other two tests of its
hierarchy, its correlations with them, multiplied together,
give the intercorrelation of those two with one another:
if k is the “pure” test, then—

$$r_{ik}\,r_{jk} = r_{ij}$$

its g saturation being—

$$r_{kg} = \sqrt{\frac{r_{ik}\,r_{jk}}{r_{ij}}} = 1$$

No such “ pure ” test of the g which is defined by the 
Brown-Stephenson hierarchy of nineteen tests has yet been 
found. Such a pure test, with full g saturation, must not 
be confused with tests which are sometimes called tests of 
pure g because they do not contain certain other factors, 
in particular the verbal factor. Thus the “S.V.P.” 
(Spearman Visual Perception) tests are referred to by 
Dr. Alexander (1935, 48) as a “pure measure of g”; but 
their saturations with g are given by him (page 107) as
.787, .701, and .736 respectively, so that in each case only
about half the variance is “g” and half is a specific.

5. The Heywood case.—Consider the case where three
tests are such that

$$r_{ik}\,r_{jk} > r_{ij}$$

In such a case the g saturation of the test k, if we calculate
it, is greater than unity, which is impossible. Yet it
is possible, in theory at least, to add tests to such a triplet
to form an extended hierarchy with zero tetrad-differences.
There can be one such case (but only one) in a hierarchy.
We shall call them Heywood cases, as this possibility was
first pointed out by him (Heywood, 1931). As an artificial
example, consider these correlations:

* It is understood, of course, that even such a test would give 
different measures of a man’s g from day to day, if the man’s per- 
formance in it varied (as it undoubtedly would) from day to day. 
By measuring with exactness is meant, in this part of the text, 
measurement free from the uncertainty due to the factors out- 
numbering the tests. We are assuming sampling errors to be nil. 
          1        2        3        4        5
  1    1.000     .945     .840     .735     .630
  2     .945    1.000     .720     .630     .540
  3     .840     .720    1.000     .560     .480
  4     .735     .630     .560    1.000     .420
  5     .630     .540     .480     .420    1.000

This is a perfect hierarchy, every tetrad-difference being
exactly zero. It is, moreover, a perfectly possible set of
correlations, and passes the tests required for a matrix of
correlations to be possible. For example, the determinant
of the matrix is positive. But when we calculate the g
saturations of the tests we find them to be:

  Test          |   1      2     3     4     5
  g saturation  |  1.05   .9    .8    .7    .6


so that a single general factor is an impossible explanation 
of this hierarchy as far as Test 1 is concerned. The 
correlations of Test 1 with the other tests are possible, and 
they give exactly zero tetrad-differences : but yet the test 
cannot be a “ two-factor ” test, for the correlations of the 
first row are too high to be explained in that way. The
rule governing its possible existence has been given by
Ledermann, namely, that the g saturation of the Heywood
case cannot exceed—

$$\sqrt{\frac{1 + S}{S}}$$

where S is the quantity familiar from Spearman’s formula—

$$S = \sum_i \frac{r_{ig}^2}{1 - r_{ig}^2}$$

for the remainder of the hierarchy (i = 2, 3, ..., n). If,
then, a Heywood test can be found to conform to a hierarchy,
it seems likely that the g defined by that hierarchy
must be abandoned. The seeker for a test for pure g is
thus in a delicate position. He wants to find a test with
full saturation of unity. But he must just hit the mark.
If the saturation exceeds unity, his whole hierarchy must
be abandoned as a definition. And even when the exact
saturation of unity has been found, there seems to be too
narrow a line dividing the perfect from the impossible, and 
the reality of the g seems to be balanced on a knife edge. 
In actual practice, of course, sampling errors would make 
the situation less acute and could for some time be called 
in to explain a certain amount of excess saturation over 
unity. 
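
The artificial example of this section can be examined numerically. The sketch below rebuilds the correlations from the five saturations, evaluates the determinant by ordinary elimination, and compares the saturation of Test 1 with the limit attributed to Ledermann; the helper functions are illustrative, and the second set of saturations (beginning with 1.07) is a hypothetical case chosen to lie beyond the limit.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(a[i][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for i in range(col + 1, n):
            factor = a[i][col] / a[col][col]
            for j in range(col, n):
                a[i][j] -= factor * a[col][j]
    return det


def hierarchical_matrix(loadings):
    """Unit diagonal; each other cell is the product of two saturations."""
    n = len(loadings)
    return [[1.0 if i == j else loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]


loadings = [1.05, .9, .8, .7, .6]
R = hierarchical_matrix(loadings)
det = determinant(R)

# S for the remainder of the hierarchy (Tests 2 to 5), and the limit.
S = sum(l * l / (1 - l * l) for l in loadings[1:])
bound = math.sqrt((1 + S) / S)

# Hypothetical saturation beyond the limit: the determinant turns negative.
det_beyond = determinant(hierarchical_matrix([1.07, .9, .8, .7, .6]))

print(round(det, 6), round(S, 4), round(bound, 4), round(det_beyond, 6))
```

For the printed example the determinant is small but positive and the saturation 1.05 lies just under the limit of about 1.064, whereas the hypothetical saturation of 1.07 gives a negative determinant, that is, an impossible set of correlations.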

6. Hierarchical order when tests equal persons in number.—
If a test cannot be found whose saturation with g is unity
(“pure g”), the other method of measuring g exactly
would seem to be to extend the hierarchy until it comprised
so many tests that the multiple correlation with g—

$$\sqrt{\frac{S}{1 + S}}$$

became practically unity. For S increases with the number
of tests, being the sum of the positive quantities—

$$\frac{r_{ig}^2}{1 - r_{ig}^2}$$

There is here a point of some theoretical interest, namely,
what happens when we have increased the number of
hierarchical tests until they are as numerous as the persons
to whom they are given? This, in view of the difficulty of
finding tests to add to a hierarchy, is admittedly not a
question likely to trouble experimenters, but its theoretical
implications are considerable.
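
How slowly the multiple correlation with g approaches unity as tests are added may be seen from a small sketch. The saturation of .6 given to every test is a hypothetical figure chosen for illustration, and the function name is likewise an assumption.

```python
import math


def multiple_correlation_with_g(loadings):
    """sqrt(S / (1 + S)), where S sums r^2 / (1 - r^2) over the tests."""
    S = sum(l * l / (1 - l * l) for l in loadings)
    return math.sqrt(S / (1 + S))


# Hypothetical battery: every test has a g saturation of .6.
for n in (1, 5, 20, 100):
    print(n, round(multiple_correlation_with_g([.6] * n), 4))
```

With a single test the expression reduces to that test's own saturation, and with more tests it rises steadily but never reaches unity for any finite battery of tests whose saturations are below unity.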

It can be shown that whenever we have a matrix of
correlations based upon the same number of tests as
persons, its determinant is zero. Now the determinant of
a hierarchical matrix (with unity in each diagonal cell)
can be shown to be of the form—

$$\begin{aligned}
&(1 - r_{1g}^2)(1 - r_{2g}^2)(1 - r_{3g}^2)\cdots(1 - r_{ng}^2)\\
&\quad + r_{1g}^2\,(1 - r_{2g}^2)(1 - r_{3g}^2)\cdots(1 - r_{ng}^2)\\
&\quad + (1 - r_{1g}^2)\,r_{2g}^2\,(1 - r_{3g}^2)\cdots(1 - r_{ng}^2)\\
&\quad + (1 - r_{1g}^2)(1 - r_{2g}^2)\,r_{3g}^2\cdots(1 - r_{ng}^2)\\
&\quad + \dots
\end{aligned}$$

and it is clear that each of these quantities is positive
unless we have a case of pure g, or a Heywood case. A
case of pure g will leave one of the rows of the above sum
non-zero. To make the whole sum zero, one case must be
a Heywood case, giving—

$$1 - r_{ig}^2 \quad \text{negative.}$$

It would seem, therefore, that by the time we have 
added hierarchical tests to make them equal in number to 
the persons, we will necessarily have added a Heywood 
hierarchical case (of which there can be only one in a 
hierarchy). But we have agreed that the discovery of a 
Heywood case will cause us to abandon the hierarchy as 
a definition of g ! 
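
The statement that the determinant vanishes when tests and persons are equal in number can be verified on any set of scores. The sketch below uses invented scores for four tests, first on four persons and then on five; the scores and the helper functions are hypothetical and serve only to illustrate the point.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(a[i][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for i in range(col + 1, n):
            factor = a[i][col] / a[col][col]
            for j in range(col, n):
                a[i][j] -= factor * a[col][j]
    return det


def correlation_matrix(scores):
    """scores[person][test] -> matrix of product-moment correlations."""
    p, n = len(scores), len(scores[0])
    unit_devs = []
    for t in range(n):
        column = [row[t] for row in scores]
        mean = sum(column) / p
        devs = [x - mean for x in column]
        norm = math.sqrt(sum(d * d for d in devs))
        unit_devs.append([d / norm for d in devs])
    return [[sum(a * b for a, b in zip(unit_devs[i], unit_devs[j]))
             for j in range(n)] for i in range(n)]


# Invented scores: rows are persons, columns are four tests.
four_persons = [[12, 7, 9, 4], [15, 9, 6, 8], [9, 11, 10, 5], [14, 6, 12, 10]]
five_persons = four_persons + [[10, 8, 7, 9]]

det_equal = determinant(correlation_matrix(four_persons))
det_more = determinant(correlation_matrix(five_persons))
print(det_equal, round(det_more, 3))
```

The deviations of four persons from their means have only three independent dimensions, so four tests measured on them must give a vanishing determinant (apart from rounding in the arithmetic); a fifth person restores a positive value.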

The case where the number of tests is increased to equal
the number of persons may seem to the reader to be an
academic case only. But the case of reducing the number
of persons until they equal the number of tests is one which 
could easily be realized in practice, and presents equal 
theoretical difficulties. This draws attention to the 
dependence of any definition of factors on the sample of 
persons tested. If we have a perfect hierarchy of, say, 
50 tests, in a population of, say, 1,000 persons, a sample of 
fifty persons from the above thousand, if it gives hier- 
archical order, will give a Heywood case, and its g will be 
impossible. 

If the g corresponding to the original analysis on the 
thousand persons were anything real, such as a given 
quantity of mental energy available in each person, then 
it ought always to be possible, one might erroneously 
think, to find fifty persons and fifty tests to give a hierarchy,
without a Heywood case. But that cannot be easily said. 
It is impossible, from the correlations alone, to distinguish 
a real g from one imitated by a fortuitous coincidence of 
specifics. Even if g were a reality, a sample of persons 
equal in number to the tests could not give a hierarchy 
without a Heywood case, and their apparent g would be 
fortuitous. 

Now the case of a test of pure g is on the border line of 
the Heywood cases. It is clear then that it will be suspect, 
as being probably only fortuitous, if the number of persons 
does not far exceed the number of tests.

7. Singly conforming tests. There remains one other
conceivable method of measuring g exactly,* by the use
of certain tests which, when they are all present, destroy
the hierarchy, although any one of them can enter the
battery without marring it: "singly conforming" tests
(Thomson, 1934b; and 1935a, 253-6). It will be shown
in later chapters on factor estimation that the reason
factors cannot be measured exactly, but have to be estimated
only, is that they outnumber the tests. Every
new test which conforms to a hierarchy adds a new specific
(unless it is pure g), and thus continues the excess of factors
over tests. It can occur, however, that the correlation of
two tests with each other breaks a hierarchy, although
either of them alone conforms otherwise. Such a case
occurs in the Brown-Stephenson battery, for example, one
of whose correlation coefficients has to be suppressed before
the hierarchy is acceptable.

In such a case, if the psychologist is prepared to accept 
either test as a member of the battery, the erring correlation 
coefficient must be due to these two tests sharing some 
portion of their specifics with one another. If, as may
happen (apart from error which we are supposing absent), 
their intercorrelation shows that they have only one specific 
factor between them, and differ only in their saturations, 
then they enable the estimate of g to be turned into accurate 
measurement. For example, consider the following matrix
of correlations:

            1      2      3      4      5      6
    1       .    .669   .592   .458   .335   .251
    2     .669     .    .566   .438   .870   .240
    3     .592   .566     .    .387   .283   .212
    4     .458   .438   .387     .    .219   .164
    5     .335   .870   .283   .219     .    .120
    6     .251   .240   .212   .164   .120     .

This is a perfect hierarchy except for the correlation

$$r_{25} = .870$$

* By "exactly" is meant, with the same exactness as the test
scores, without the additional indeterminacy due to an excess of
factors over tests.

Every tetrad-difference which does not contain this
correlation is zero. If either Test 2 or Test 5 is removed
from the battery, there remains a perfect hierarchy. If
Test 5 is removed, we can calculate from the remaining
battery the g saturations:

    Test             1      2      3      4      6
    g saturation   .837   .800   .707   .548   .300

If we remove Test 2 and restore Test 5, we get the following:

    Test             1      3      4      5      6
    g saturation   .837   .707   .548   .400   .300


From either hierarchy we can estimate g. The correlation
of our estimates with "true g" will be

$$\sqrt{\frac{S}{1 + S}}, \qquad\text{where}\qquad S = \sum \frac{\text{saturation}^2}{1 - \text{saturation}^2},$$

and we find for the two hierarchies the g correlations of
.92 and .90.
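The reader who wishes to repeat this arithmetic may do so with a few lines of Python. This is a minimal sketch, assuming the two sets of saturations tabulated above; the function name g_correlation is ours.

```python
from math import sqrt

def g_correlation(saturations):
    """Correlation of the estimate with g: sqrt(S / (1 + S)),
    where S is the sum of r^2 / (1 - r^2) over the tests."""
    s = sum(r * r / (1 - r * r) for r in saturations)
    return sqrt(s / (1 + s))

without_test_5 = [0.837, 0.800, 0.707, 0.548, 0.300]  # Tests 1, 2, 3, 4, 6
without_test_2 = [0.837, 0.707, 0.548, 0.400, 0.300]  # Tests 1, 3, 4, 5, 6

r_without_5 = g_correlation(without_test_5)
r_without_2 = g_correlation(without_test_2)
print(round(r_without_5, 3), round(r_without_2, 3))
```

The hierarchy containing Test 2 does the better of the two, since Test 2 has the higher saturation and so adds more to S.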

From the two Tests 2 and 5 alone, however, we can obtain
a g correlation of unity.

The reason for this is that the correlation of Tests 2
and 5 is such as to show that their specifics are identical,
the two tests differing only in their loadings. Their
equations are

$$z_2 = .8g + \sqrt{1 - .8^2}\,s_2$$
$$z_5 = .4g + \sqrt{1 - .4^2}\,s_5$$

If the whole of $s_2$ is identical with the whole of $s_5$, their
intercorrelation should be

$$.8 \times .4 + \sqrt{(1 - .8^2)(1 - .4^2)} = .870$$

and this is its experimental value.

We could, therefore, have seen at the beginning, if we
had tested the above fact, that these two tests would make
a perfect battery for measuring g. We have the simultaneous
equations

$$z_2 = .8g + .6s$$
$$z_5 = .4g + .917s$$

from which we can eliminate s.
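The elimination can be written out explicitly. The sketch below assumes the two equations just given; the scores g_true and s_true are hypothetical values chosen only to show that g is recovered exactly, and the function name is ours.

```python
from math import sqrt

a2, a5 = 0.8, 0.4              # g saturations of Tests 2 and 5
b2 = sqrt(1 - a2 ** 2)         # loading of Test 2 on the shared specific (.6)
b5 = sqrt(1 - a5 ** 2)         # loading of Test 5 on the shared specific (.917)

# Correlation implied when the two specifics are one and the same factor.
r25 = a2 * a5 + b2 * b5

def g_from_scores(z2, z5):
    """Eliminate s from z2 = a2*g + b2*s and z5 = a5*g + b5*s."""
    return (b5 * z2 - b2 * z5) / (a2 * b5 - a5 * b2)

# Hypothetical standard scores of one person in g and in the shared specific.
g_true, s_true = 1.3, -0.7
z2 = a2 * g_true + b2 * s_true
z5 = a5 * g_true + b5 * s_true
g_recovered = g_from_scores(z2, z5)
print(round(r25, 3), round(g_recovered, 6))
```

Two equations in the two unknowns g and s leave no indeterminacy, which is why the pair measures g and does not merely estimate it.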

We see, therefore, that under certain hypothetical
circumstances, a more exact estimate of g can be obtained
from two of these "singly conforming" tests than from the
hierarchy with which they conform individually. Those
circumstances are, that their correlation with one another
(the correlation which breaks the hierarchy because it is
too large) should either equal

$$r_{ig}\,r_{jg} + \sqrt{(1 - r_{ig}^2)(1 - r_{jg}^2)}$$

(where $r_{ig}$ and $r_{jg}$ are the g saturations of the two tests)
or should approach this value.

It cannot in actual practice be expected to equal it, as
in our artificial example. For we have disregarded errors,
which are sure in some measure to be present. At what
stage will the pair of singly conforming tests cease to be
a better measure of g than the better of the two hierarchies
made by deleting either the one or the other? If in our
example the correlation .870 of Tests 2 and 5 be imagined
to sink little by little, the correlation of their estimate
with g will sink from unity. The better of the two hierarchies
gives a multiple correlation of .922. When the
correlation $r_{25}$ has sunk from .870 to .847, these two singly
conforming tests will give the same multiple correlation,
.922. If this defect from the full .870 is due entirely to
error, then a fall to .847 corresponds to reliabilities of the
two tests of the order of magnitude of .98, if they are
equally reliable. This is a very high reliability, seldom
attained, so that in a case like our example quite a small
admixture of error would make the singly conforming
tests no better at estimating g than the hierarchy. We
are here, however, neglecting the fact that error would also
diminish the efficiency of the hierarchy. Nevertheless, the
chance of finding a pair of singly conforming tests, highly
reliable, and having no specifics except that which they
share, seems small, as small as the chance of finding a test
of pure g, perhaps. It might possibly turn out, however,
that a matrix of several (say t) singly conforming tests
would be practicable. Such a set would measure g exactly
if among them they added only t - 1 new specifics to the
hierarchy. Their saturations would be found by placing
them one at a time in the hierarchy, and then their regression
on g calculated by Aitken's method (see Chapter XIV).
The necessity for the hierarchy in the background, in all
this, is clear: it is there to assure us that each singly conforming
test is compatible with the definition of g, and to
enable its g saturation to be calculated.

8. The danger of "reifying" factors. The orthodox view
of psychologists trained in the Spearman school is that g is,
of all the factors of the mind, the most ubiquitous. "All
abilities involve more or less g," Spearman said, although
in some the other factors are "so preponderant that, for
most purposes, the g factor can be neglected." With
this view, the present author has always agreed, provided
that g is interpreted as a mathematical entity only, and
judgment is suspended as to whether it is anything more
than that.

The suggestion, however, that g is "mental energy," of
which there is only a limited amount available, but available
in any direction, and that the other factors are the
neural machines, is one to be considered with caution.
The word energy has a definite physical meaning. "Mental
energy" may convey the meaning that the energy spoken
of is the same as physical energy, though devoted to mental
uses. If that meaning is accepted, innumerable difficulties
follow, not the least being the insoluble questions of the
connexion of body and mind, and of freewill versus
determinism. A less obscure difficulty is that there seems
to be no easily conceivable way in which the "energy"
of the whole brain can be used in any direction indifferently,
except by the "neural engines" also all taking part. The
energy of a neurone seems to reside in it, and the passage
of a nerve impulse along a neurone seems to resemble
rather the burning of a very rapid fuse, than the conduction
of electricity, say, by a wire.

If "mental energy" does not mean physical energy at
all, but is only a term coined by analogy to indicate that
the mental phenomena take place "as if" there were such
a thing as mental energy, these objections largely disappear.
Even in physical or biological science, the things which are
discussed and which appear to have a very real existence
to the scientist, such as "energy," "electron," "neutron,"
"gene," are recognized by the really capable experimenter
as being only manners of speech, easy ways of putting into
comparatively concrete terms what are really very abstract
ideas. With the bulk of those studying science there exists
always the danger that this may be taken too literally, but
this danger does not justify us in ceasing to use such terms.
In the same way, if terms like "mental energy" prove to
be useful, and can be kept in their proper place, they may
be justified by their utility. The danger of "reifying"
such terms, or such factors as g, v, etc., is, however, very
great.


PART II 
MULTIPLE-FACTOR ANALYSIS 


CHAPTER V


THE CENTROID METHOD 


1. Need of group factors. The two-factor method of
analysis, described in an earlier chapter, began with the idea
that a matrix of correlations would ordinarily show perfect
hierarchical order if care was taken to avoid tests which
were "unduly similar," i.e. very similar indeed to one
another. If such were found coexisting in the team of
tests, the team had to be “ purified " by the rejection of 
one or other of the two. Later it became clear that this 
process involves the experimenter in great difficulty, for it 
subjects him to the temptation to discover “ undue simi- 
larity " between tests after he has found that their correla- 
tion breaks the hierarchy. Moreover, whole groups of 
tests were found to fail to conform ; and so group factors 
were admitted, though always, by the experimenter trained 
in that school, with reluctance and in as small a number as 
possible. It had, however, become quite clear that the 
Theory of Two Factors in its original form had been super- 
seded by a theory of many factors, although the method 
of two factors remained as an analytical device for 
indicating their presence and for isolating them in com- 
parative purity. 

Under these circumstances it is not surprising that some 
workers turned their attention to the possibility of a method 
of multiple-factor analysis, by which any matrix of test 
correlations could be analysed direct into its factors 
(Garnett, 1919a and b). It was Professor Thurstone of 
Chicago who saw that one solution to this problem could 
be reached by a generalization of Spearman’s idea of zero 
tetrad-differences. 

2. Rank of a matrix and number of factors. We saw that
when all the tetrad-differences are zero, the correlations
can all be explained by one general factor, a tetrad being
formed of the intercorrelations of two tests with two other
tests, thus:

          |    3       4
        1 |  r_13    r_14
        2 |  r_23    r_24

and the tetrad-difference being

$$r_{13} r_{24} - r_{23} r_{14}$$

Thurstone's idea, though rather differently expressed by
him, can be based on a second, third, fourth . . . calculation
of certain tetrad-differences of tetrad-differences.
To explain this, let us consider the correlation coefficients
which three tests make with three others:

               4       5       6
        1    r_14    r_15    r_16
        2    r_24    r_25    r_26
        3    r_34    r_35    r_36


This arrangement of nine correlation coefficients might 
have been called a “ nonad,” by analogy with the tetrad. 
Actually, by mathematicians, it is called a “ minor deter- 
minant of order three” or more briefly a three-rowed 
minor ; a tetrad is in this nomenclature a “ minor of order 
two.” 

We can now, on the above three-rowed determinant, 
perform the following calculation. Choose the top left 
coefficient as “pivot,” and calculate the four tetrad- 
differences of which it forms part, namely : 


$$
\begin{matrix}
(r_{14} r_{25} - r_{24} r_{15}) & (r_{14} r_{26} - r_{24} r_{16}) \\
(r_{14} r_{35} - r_{34} r_{15}) & (r_{14} r_{36} - r_{34} r_{16})
\end{matrix}
$$

These four tetrad-differences now themselves form a
tetrad which can be evaluated. If it is zero, we say that
the three-rowed determinant with which we started
"vanishes."

Exactly the same repeated process can be carried on with
larger minor determinants. For example, the minor of
order four here shown vanishes:

                        (.26)          .32          .38          .34
                         .42           .36          .62          .72
                         .44           .62          .66          .46
                         .45           .58          .63          .60

    for its pivotal    (-.0408)       .0016        .0444
    t.d.'s are           .0204        .0044       -.0300
                         .0068       -.0072        .0030

    and then         (-.00021216)   .00031824
                       .00028288   -.00042432

    and finally          zero

This process of continually calculating tetrads is called
"pivotal condensation." The reader should be given a
word of warning here, that the end-result of this form of
calculation, if not zero, has to be divided by the product of
certain powers of the pivots, to give the value of the determinant
we began with. A routine method (Aitken, 1937a)
of carrying out pivotal condensation, including division
by the pivot at each step, is described in Chapter XIV,
pages 201ff.*
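The condensation of the four-rowed minor above can be repeated mechanically. The following sketch is our own illustration, not Aitken's routine: it forms only the tetrad-differences about the top left pivot at each step, without the division by the pivot, and works in exact fractions so that the final zero is a true zero.

```python
from fractions import Fraction

def condense(matrix):
    """One step of pivotal condensation: every cell outside the first row
    and column is replaced by its tetrad-difference with the pivot."""
    pivot = matrix[0][0]
    return [[pivot * matrix[i][j] - matrix[i][0] * matrix[0][j]
             for j in range(1, len(matrix[0]))]
            for i in range(1, len(matrix))]

def hundredths(values):
    return [Fraction(v, 100) for v in values]

minor = [
    hundredths([26, 32, 38, 34]),
    hundredths([42, 36, 62, 72]),
    hundredths([44, 62, 66, 46]),
    hundredths([45, 58, 63, 60]),
]

steps = [minor]
while len(steps[-1]) > 1:
    steps.append(condense(steps[-1]))

final = steps[-1][0][0]
for step in steps[1:]:
    print([[float(v) for v in row] for row in step])
```

Had a pivot itself been zero the process as written could not continue from that cell, and another pivot would have to be chosen; that refinement is omitted here.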

We can in this way examine the minors of orders two,
three, four (and so on) of a correlation matrix, always
avoiding those diagonal cells which correspond to the
correlation of a test with itself. We may come to a point
at which all the minors of that order vanish. Suppose these
minors which all vanish are the minors of order five. We
then say that the "rank" of the correlation matrix is four
(with the exception of the diagonal cells). There then
exists the possibility that the "rank" of the whole correlation
matrix can be reduced to four by inserting suitable
quantities in the diagonal cells (see next section). The
"rank" of a matrix is the order of its largest† non-vanishing
minor. The tests can then be analysed into as many
common factors as the above reduced rank of their correlation
matrix (the rank, that is to say, apart from the diagonal
cells) plus a specific in each test.

* If the process gives, at an earlier stage than the end, a matrix
entirely composed of zeros, the rank of the original determinant is
correspondingly less, being equal to the number of condensations
needed to give zeros.
† "Largest" refers to the number of rows, not to the numerical
value.

3. Thurstone’s method used on a hierarchy.—Thurstone’s 
rule about the rank includes Spearman’s hierarchy as a 
special case, for in a hierarchy the tetrads—that is, the 
minors of order two—vanish. The rank is therefore one, 
and a hierarchical set of tests can be analysed into one 
common factor plus a specific in each. A simple way of 
introducing the reader to Thurstone’s hypothesis and also 
to his "centroid" method* of finding a set of factor saturations
will be to use it first of all on the perfect Spearman
hierarchy which we cited as an artificial example in our
first chapter.

    Tests     1      2      3      4      5      6
      1       .     .72    .63    .54    .45    .36
      2      .72     .     .56    .48    .40    .32
      3      .63    .56     .     .42    .35    .28
      4      .54    .48    .42     .     .30    .24
      5      .45    .40    .35    .30     .     .20
      6      .36    .32    .28    .24    .20     .

The first step in Thurstone's method, after the rank has
been found, is to place in the blank diagonal cells numbers
which will cause these cells also to partake of the same rank
as the rest of the matrix, numbers which, for a reason which
will become clear later, are called "communalities." In
our present Spearman example that rank is one, i.e. the
tetrads vanish. The communalities, therefore, must be
such numbers as will make also those tetrads vanish which
include a diagonal cell: this enables them to be calculated.
Let us, for example, fix our attention on the communality
of the first test, which we will designate $h_1^2$ (the reason for
the "square" will become apparent later). Then the
tetrad formed by Tests 1 and 2 with Tests 1 and 3 is:

          |    1        3
        1 |  h_1^2     .63
        2 |   .72      .56

and the tetrad-difference has to vanish. Therefore

$$.56\,h_1^2 - .72 \times .63 = 0$$
$$\therefore\ h_1^2 = .81$$

* We shall see why it is called the "centroid" method in the
next chapter.

Similarly all the communalities can be calculated, and
found to be

    .81    .64    .49    .36    .25    .16

(The observant reader will notice that they are the squares
of the "saturations" of our first chapter; but let us continue
as though we had not noticed this.)
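The same tetrad condition gives every communality in turn. As a sketch only, and assuming the hierarchical matrix printed above (entries in hundredths, diagonal cells blank), each communality is the product of two correlations of the test divided by the correlation of the two other tests with each other; the choice of which two other tests to use is ours and, in a perfect hierarchy, immaterial.

```python
from fractions import Fraction

# The artificial hierarchy, in hundredths; None marks a blank diagonal cell.
rows = [
    [None, 72, 63, 54, 45, 36],
    [72, None, 56, 48, 40, 32],
    [63, 56, None, 42, 35, 28],
    [54, 48, 42, None, 30, 24],
    [45, 40, 35, 30, None, 20],
    [36, 32, 28, 24, 20, None],
]
r = [[None if v is None else Fraction(v, 100) for v in row] for row in rows]

def communality(r, i):
    """Solve the vanishing tetrad h_i^2 * r_jk - r_ij * r_ik = 0,
    using the first two tests other than i as j and k."""
    j, k = [t for t in range(len(r)) if t != i][:2]
    return r[i][j] * r[i][k] / r[j][k]

communalities = [communality(r, i) for i in range(len(r))]
print([float(h) for h in communalities])
```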

The method of finding the saturations of each test with
the first common factor is then to insert the communalities
in the diagonal cells and add up the columns* of the
matrix, thus:

                 Original Correlation Matrix

    (.81)    .72     .63     .54     .45     .36
     .72    (.64)    .56     .48     .40     .32
     .63     .56    (.49)    .42     .35     .28
     .54     .48     .42    (.36)    .30     .24
     .45     .40     .35     .30    (.25)    .20
     .36     .32     .28     .24     .20    (.16)

    3.51    3.12    2.73    2.34    1.95    1.56   = 15.21

The column totals are then themselves added together
(15.21) and the square root taken (3.90). The "saturations"
of the first (and here the only) common factor
are then the columnar totals divided by this square root,
namely

    3.51    3.12    2.73    2.34    1.95    1.56
    ----    ----    ----    ----    ----    ----
    3.90    3.90    3.90    3.90    3.90    3.90

    or  .9      .8      .7      .6      .5      .4

as in the present instance we already know them to be.
(Very often in multiple-factor analysis the "saturation"
of a test with a factor is called the "loading," and this is
a convenient place to introduce the new term.)

* This, the "centroid" method of finding a set of loadings, is not in
any way bound up with Thurstone's theorem about the rank and
the number of common factors. It can be used, for example, with
unity in each diagonal cell, in which case it will give as many common
factors as there are tests, and no specific factors.
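The column-summing rule is short enough to state as a program. The sketch below assumes the matrix just printed, with the communalities already in the diagonal; the function name centroid_loadings is our own label for the step.

```python
from math import sqrt

hierarchy = [
    [0.81, 0.72, 0.63, 0.54, 0.45, 0.36],
    [0.72, 0.64, 0.56, 0.48, 0.40, 0.32],
    [0.63, 0.56, 0.49, 0.42, 0.35, 0.28],
    [0.54, 0.48, 0.42, 0.36, 0.30, 0.24],
    [0.45, 0.40, 0.35, 0.30, 0.25, 0.20],
    [0.36, 0.32, 0.28, 0.24, 0.20, 0.16],
]

def centroid_loadings(matrix):
    """Column totals divided by the square root of the grand total."""
    column_totals = [sum(column) for column in zip(*matrix)]
    grand_total = sum(column_totals)
    return [total / sqrt(grand_total) for total in column_totals]

first_loadings = centroid_loadings(hierarchy)
print([round(loading, 4) for loading in first_loadings])
```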

As applied to the hierarchical case, this method of
finding the saturations or loadings had been devised and
employed many years previously by Cyril Burt, though it
is not quite clear how he would have filled in the blank
diagonal cells (Burt, 1917, 58, footnote, and 1940, 448, 462).
It should be explained that in actual practice Thurstone
and his followers do not calculate the minor determinants
to find the rank and the communality, for that would be
too laborious. Instead they adopt various approximations,
of which the simplest is to insert in each diagonal cell the 
largest correlation coefficient of the column (see Section 
10). 

4. The second stage of the "centroid" method. If there is
more than one common factor, the process goes on to
another stage. Even with our example we can show the
beginning of this second stage, which consists in forming
that matrix of correlations which the first factor alone
would produce. This is done by writing the loadings
along the two sides of a chequer board and filling every cell
of the chequer board with the product of the loading of
that row with the loading of that column, thus:

                    First-factor Matrix

          |   .9     .8     .7     .6     .5     .4
      .9  |  .81    .72    .63    .54    .45    .36
      .8  |  .72    .64    .56    .48    .40    .32
      .7  |  .63    .56    .49    .42    .35    .28
      .6  |  .54    .48    .42    .36    .30    .24
      .5  |  .45    .40    .35    .30    .25    .20
      .4  |  .36    .32    .28    .24    .20    .16
This is the “ first-factor matrix ” which gives the parts of 
the correlations due to the first factor. This matrix has now 
to be subtracted from the original matrix to find the resi- 
dues which must be explained by further common factors. 

In our present example the first-factor matrix is identical 
with the original matrix and the residues are all zero. Only
the one common factor is therefore required. (Of course,
the reader will understand that in a real experimental
matrix the residues can never be expected to be exactly
zero: one is content when they are near enough to zero to
be due to chance experimental error.) Had the rank of
our original matrix of correlations been, however, higher
than one, there would have been a matrix of residues.

Let us now make an artificial example with a larger
number of common factors, say three, which we can afterwards
use to illustrate the further stages of Thurstone's
method. We can do this in an illuminating manner by
the aid of the oval diagrams described in Chapter I.

5. A three-factor example. In Figure 10, a diagram of the
overlapping variances of four tests, let us insert three
common factors and specifics to complete the variance of
each test to 10 (to make our arithmetical work easy). No
factor here is common to all the four tests. The factor with
a variance of 4 runs through Tests 1, 2, and 3. That with a
variance 3 runs through Tests 2, 3, and 4. That with a
variance 2 runs through Tests 1 and 4. The other factors
are specifics. The four test variances being each 10, the
correlation coefficients are written down from the overlaps
by inspection as:

[Figure 10.]

             1       2       3       4
      1 |  (.6)     .4      .4      .2
      2 |   .4     (.7)     .7      .3
      3 |   .4      .7     (.7)     .3
      4 |   .2      .3      .3     (.5)

Moreover, we can put into our matrix the communalities
corresponding to our diagram. Each communality is, in
fact, that fraction of the variance of a test which is not
specific. Thus .6 of the variance of Test 1 is "communal,"
.4 being specific or "selfish." In this way we have the
matrix above, with communalities inserted. We can now
pretend that it is an experimental matrix, ready for the
application of Thurstone's method, as follows:

                     (.6)      .4       .4       .2
                      .4      (.7)      .7       .3       Original
                      .4       .7      (.7)      .3       experimental
                      .2       .3       .3      (.5)      matrix.

                     1.6      2.1      2.1      1.3      = 7.1 = 2.6646²

    1st Loadings  |  .6005    .7881    .7881    .4879    = 2.6646*

        .6005       (.3606)   .4733    .4733    .2930
        .7881        .4733   (.6211)   .6211    .3845     First-factor
        .7881        .4733    .6211   (.6211)   .3845     matrix.
        .4879        .2930    .3845    .3845   (.2380)

Here it is seen that the loadings of the first factor, when 
cross-multiplied in a chequer board, give a first-factor 
matrix which is not identical with the original experimental 
matrix, unlike the case of the former, hierarchical, matrix. 
Here (as we who made the matrix know) one factor will 
not suffice. We subtract the first-factor matrix from the 
original experimental matrix to see how much of the 
correlations still has to be explained, and how much of the
"communalities" or communal variances. The latter
were

    .6       .7       .7       .5

and of these amounts the first factor has explained

    .3606    .6211    .6211    .2380


If we subtract the first-factor matrix, element by element,
from the original experimental matrix, we get the residual
matrix:

    (.2394)   -.0733    -.0733    -.0930
    -.0733    (.0789)    .0789    -.0845      First residual
    -.0733     .0789    (.0789)   -.0845      matrix.
    -.0930    -.0845    -.0845    (.2620)

* This check should always be applied. To avoid complication
it is not printed in the later tables. It applies to the loadings with
their temporary signs (see page 72).

To this matrix we are now going to apply exactly the same
procedure as we applied to the original experimental
matrix, in order to find the loadings of the second factor.
But we meet at once with a difficulty. The columns of the
residual matrix add up exactly* to zero! This always
happens, and is indeed a useful check on our arithmetical
work up to this point, but it seems to stop our further
progress.
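The first stage on this matrix, and the vanishing of the residual column totals, may be verified as follows. This is a minimal sketch assuming the experimental matrix above; the helper names are ours. The totals vanish because each column total of the first-factor matrix is the loading of that column multiplied by the sum of the loadings, which is the square root of the grand total.

```python
from math import sqrt

matrix = [
    [0.6, 0.4, 0.4, 0.2],
    [0.4, 0.7, 0.7, 0.3],
    [0.4, 0.7, 0.7, 0.3],
    [0.2, 0.3, 0.3, 0.5],
]

def centroid_loadings(m):
    """Column totals divided by the square root of the grand total."""
    totals = [sum(column) for column in zip(*m)]
    return [t / sqrt(sum(totals)) for t in totals]

def factor_matrix(loadings):
    """Chequer board of products of the loadings."""
    return [[a * b for b in loadings] for a in loadings]

def subtract(m, f):
    return [[x - y for x, y in zip(row_m, row_f)] for row_m, row_f in zip(m, f)]

loadings = centroid_loadings(matrix)
residual = subtract(matrix, factor_matrix(loadings))
column_sums = [sum(column) for column in zip(*residual)]

print([round(l, 4) for l in loadings])
for row in residual:
    print([round(v, 4) for v in row])
```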

To get over this difficulty we change temporarily the signs
of some of the tests in order to make a majority of the cells
of each column of the matrix positive. The best plan is to
change the sign of the test with most minuses in its column
and row, and so on until there is a large majority of plus
signs. Copy the signs on a separate paper, omitting the
diagonal signs, which never change. Since some signs
will change twice or thrice, use the convention that a
plus surrounded by a ring means minus, and if then
covered by an X means plus again. Near the end, watch
the actual numbers, for the minus signs in a column may
be very small. The object is to make the grand total
a maximum, and thus take out maximum variance with
each factor. We shall here, however, for simplicity adopt
an easier rule, i.e. to seek out the column whose total
regardless of sign is the largest, and then temporarily change
the signs of variables so as to make all the signs in that
column positive.

The sums of the residual columns, regardless of sign, are

    .4790    .3156    .3156    .5240

and therefore we must change the signs of tests so as to
make all the signs in Column 4 positive; that is, we must
change the signs of the first three tests.† Since we change
the three row signs, as well as the three column signs, this
will leave a block of signs unchanged, but will make
the last column and the last row all positive. We can
then proceed as shown below.

* When enough decimals have been retained. In practice there
may be a discrepancy in the last decimal place.
† Changing the sign of Test 4 would here have the same result,
but for uniformity of routine we stick to the letter of the rule.

                      .2394     -.0733     -.0733    (-).0930
                     -.0733      .0789      .0789    (-).0845     First residual
                     -.0733      .0789      .0789    (-).0845     matrix with
                   (-).0930   (-).0845   (-).0845      .2620      changed signs.

                      .1858      .1690      .1690      .5240     = 1.0478 = 1.0236²

    2nd Loadings      .1815      .1651      .1651      .5119      With temporary signs.

        .1815         .0329      .0300      .0300      .0929
        .1651         .0300      .0273      .0273      .0845      Second-factor
        .1651         .0300      .0273      .0273      .0845      matrix.
        .5119         .0929      .0845      .0845      .2620

                      .2065     -.1033     -.1033      .0001
                     -.1033      .0516      .0516       .         Second residual
                     -.1033      .0516      .0516       .         matrix.
                      .0001       .          .          .


On the matrix with these temporarily changed signs we 
then operate exactly as we did on the original experimental 
matrix, and obtain second-factor loadings which (with 
temporary signs) are— 


    .1815    .1651    .1651    .5119


The second-factor matrix, that is, the matrix showing 
how much correlation is due to the second factor, is then 
made on a chequer board still using the temporary signs, 
and subtracted from the previous matrix of residues (with 
its temporary signs, not with its first signs) to find the 
residues still remaining, to be explained by further factors. 
In the present instance we see that the whole variance of
the fourth test entirely disappears, and also all the correlations
in which that test is concerned.* This test, therefore,
is fully explained by the two factors already extracted.
Only the first three test variances remain unexhausted,
and their correlations. Again the columns of the residual
matrix sum exactly to zero. Following our rule, the signs
of Tests 2 and 3 have to be temporarily changed before
the process can continue. After these changes of sign the
second residual matrix is as follows, and the same operation
as before is again performed on it:

* When enough decimals are retained. We shall treat the
.0001 as zero.

                      .2065   (-).1033   (-).1033      .       Second residual
                   (-).1033      .0516      .0516      .       matrix with signs
                   (-).1033      .0516      .0516      .       temporarily changed.

                      .4131      .2065      .2065      .      = .8261 = .9089²

    3rd Loadings      .4545      .2272      .2272      .       with temporary signs.

With these third-factor loadings we can now calculate the 
variances and correlations due to the third factor : and we 
find these are exactly equal to the second residual matrix. 
On subtracting, the third residual matrix we obtain is 
entirely composed of zeros. (In a practical example we
should be content if it was sufficiently small.) We thus
find (as our construction of the artificial tests entitled us to
expect) that the matrix of correlations can be completely
explained by three common factors.
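The whole extraction can be gathered into one routine. What follows is a sketch under stated assumptions, not a standard implementation: it uses the simplified sign rule of this chapter, treats residuals smaller than a tolerance as zero, and keeps a running record of the temporary sign changes so that each loading can be given its proper sign. It carries full decimal accuracy, so its figures may differ in the fourth place from the four-decimal working shown in the tables.

```python
from math import sqrt

def centroid_factors(matrix, max_factors=10, tol=1e-9):
    """Extract centroid factors until the residuals are negligible.
    Returns one list of loadings (with proper signs) per factor."""
    n = len(matrix)
    residual = [row[:] for row in matrix]
    signs = [1] * n          # running product of the temporary sign changes
    factors = []
    while (len(factors) < max_factors
           and max(abs(v) for row in residual for v in row) > tol):
        # Simplified rule: make the column with the largest total,
        # regardless of sign, wholly positive.
        abs_totals = [sum(abs(v) for v in column) for column in zip(*residual)]
        target = abs_totals.index(max(abs_totals))
        flip = [-1 if residual[i][target] < -tol else 1 for i in range(n)]
        residual = [[flip[i] * flip[j] * residual[i][j] for j in range(n)]
                    for i in range(n)]
        signs = [s * f for s, f in zip(signs, flip)]
        totals = [sum(column) for column in zip(*residual)]
        root = sqrt(sum(totals))
        temporary = [t / root for t in totals]
        factors.append([s * l for s, l in zip(signs, temporary)])
        # Subtract this factor's matrix, keeping the temporary signs.
        residual = [[residual[i][j] - temporary[i] * temporary[j]
                     for j in range(n)] for i in range(n)]
    return factors

matrix = [
    [0.6, 0.4, 0.4, 0.2],
    [0.4, 0.7, 0.7, 0.3],
    [0.4, 0.7, 0.7, 0.3],
    [0.2, 0.3, 0.3, 0.5],
]
factors = centroid_factors(matrix)
for number, loadings in enumerate(factors, start=1):
    print(number, [round(l, 4) for l in loadings])
```

A useful check on the signs is that the products of the loadings, summed over the factors, should reproduce every cell of the matrix we began with.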

After the analysis has been completed, some care is 
needed in returning from the temporary signs of the load- 
ings to the correct signs. The only safe plan is to write 
down first of all the loadings with their temporary signs 
as they came out in the analysis. In our present example 
these happen to be all positive, though that will not 
always occur.

          Loadings with Temporary Signs

    Test       I         II        III
      1      .6005     .1815     .4545
      2      .7881     .1651     .2272
      3      .7881     .1651     .2272
      4      .4879     .5119       .

Now, in obtaining Loadings II the signs of Tests 1, 2, and 
3 were changed. We must, therefore, in the above table 
reverse the signs of the loadings of these three tests in 
Column II and each later column. Then in obtaining 
Loadings III the signs of Tests 2 and 3 were changed; that

If we were actually confronted with the matrix of correlations
shown on page 69, and asked what the communalities
were which reduced it to the lowest possible rank, we would
find it very unsatisfactory to have to guess at random and
try each set; and our embarrassment would be still greater
if there were more tests in the battery, as would actually be
the case in practice. There would also be sampling error
(which in this our preliminary description of Thurstone's
method we are assuming to be non-existent). Under these
circumstances, devices for arriving rapidly at approximate
values of the communalities are very desirable.

10. Method of approximating to the communalities. Thurstone
has described many ways of estimating the communalities,
and articles still issue from his laboratory on
this subject. He points out, however, that if the number
of tests is fairly large, an exact estimate is not very important,
and can in any case be improved by iteration, using
the sums of squares of the loadings for a new estimate.

The simplest plan is to use as an approximate communality
the largest correlation coefficient in the column.
That this is plausible can be seen from a consideration of the
case where there is only one factor, when the communality
of Test 1 would be $r_{12} r_{13} / r_{23}$, which is likely to be roughly
equal to either $r_{12}$ or $r_{13}$ if these tests correlate highly with
Test 1 and probably therefore with each other.

We shall illustrate this, the easiest, method on the same
example as we used above, for the sake of comparison and
for ease in arithmetical computation, even although that
example is really an exact and artificial one unclouded by
sampling error. Inserting then the highest coefficients in
each column we get:

                      (.5883)    .4        .4        .2        .5883
                       .4       (.7)       .7        .3        .2852
                       .4        .7       (.7)       .3        .2852
                       .2        .3        .3       (.3)       .1480
                       .5883     .2852     .2852     .1480    (.5883)

                      2.1766    2.3852    2.3852    1.2480    1.8950   = 10.0900 = 3.1765²

    First Loadings     .6852     .7509     .7509     .3929     .5966

The communalities which really give the minimum rank
are, as we saw on page 82:

    .7       .7       .7       .1303    .5

and the correct first-factor loadings obtained by their use:

    .7257    .7564    .7564    .3420    .5729
With a large battery the difference between the loadings 
obtained by the approximation and by the correct communalities 
would be much less. For the "centroid" method 
depends on the relative totals of the columns of the correlation 
matrix; and when there are twenty or more tests, 
these relative totals will not be seriously changed by the 
exact value given to the communality in the column. 
When the number of tests is large, the influence of the one 
communality in each column is swamped by the influence 
of the numerous correlations. 
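
The comparison just made can be checked with a short sketch. It assumes only what the text states: the five-test correlations, the rule of inserting the largest coefficient of each column in the diagonal, and the centroid rule that each first-factor loading is the column total divided by the square root of the grand total. The function names are illustrative and not part of Thurstone's notation.

```python
import math

# Correlations of the five-test example; diagonal cells are filled in separately.
R = [
    [0.0, 0.4, 0.4, 0.2, 0.5883],
    [0.4, 0.0, 0.7, 0.3, 0.2852],
    [0.4, 0.7, 0.0, 0.3, 0.2852],
    [0.2, 0.3, 0.3, 0.0, 0.1480],
    [0.5883, 0.2852, 0.2852, 0.1480, 0.0],
]

def with_diagonal(m, diag):
    """Copy m, placing the given communalities in the diagonal cells."""
    n = len(m)
    return [[diag[i] if i == j else m[i][j] for j in range(n)] for i in range(n)]

def largest_in_column(m):
    """Largest off-diagonal coefficient (disregarding sign) in each column."""
    n = len(m)
    return [max(abs(m[i][j]) for i in range(n) if i != j) for j in range(n)]

def first_centroid_loadings(m):
    """Column totals divided by the square root of the grand total."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    root = math.sqrt(sum(sums))
    return [s / root for s in sums]

guessed = first_centroid_loadings(with_diagonal(R, largest_in_column(R)))
exact = first_centroid_loadings(with_diagonal(R, [0.7, 0.7, 0.7, 0.1303, 0.5]))
print([round(x, 4) for x in guessed])
print([round(x, 4) for x in exact])
```

The two printed rows correspond to the approximate and the correct first-factor loadings quoted above.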

The process now goes on as on page 71, and the residuals 
left after subtraction of the first-factor matrix check by 
summing in each column to zero, as there. 

Before, however, proceeding any farther, in this approxi- 
mate method we delete the quantities in the diagonal (the 
residues of the guessed communalities) and replace them by 
the largest coefficient in the column regardless of its sign, 
which we change to plus in the diagonal cell if it is negative 
in its own cell.* The reason for this is apparent, especially 
when, as may and does happen, the existing diagonal 
residues are negative, which is theoretically impossible. 


For although the guessing of the first communalities does 
not in a large battery make much difference to the first- 
factor loadings, it may make a big difference to the diagonal 
residues. If the battery is very large indeed, our first- 
factor loadings would come out much the same, even if we 
entered zero for every communality, but the diagonal 
residues would then all be negative. In short, the diagonal 
residues are much the least trustworthy part of the calcu- 


* It is necessary to keep an eye on the fact that what is inserted 
must not, with the squares of the previous loadings of that test, 
amount to more than unity. 

is, in our case, changed back to positive. The loadings 
with their proper signs are therefore as shown in the first 
three columns of this table: 


Loadings of the Factors (Signs Replaced) 

Test        I         II        III      Specific
1         .6005    -.1815    -.4545       .6324
2         .7881    -.1651    +.2272       .5477
3         .7881    -.1651    +.2272       .5477
4         .4879    +.5119       .         .7071


In this table each column of loadings, for the common 
factors after the first, adds up to zero. The loading of the 
specific is found from the fact that in each row the sum of 
the squares must be unity, being the whole variance of the 
test. The inner product * of each pair of rows gives the 
correlation between those two tests (Garnett, 1919a). 
Thus: 

$r_{12} = .6005 \times .7881 + .1815 \times .1651 - .4545 \times .2272 = .4000$


in agreement with the entry in the original correlation 
matrix. With artificial data like the present, the analysis 
results in loadings which give the correlations back exactly. 

It will be seen that all the signs in any column of the 
table of loadings can be reversed without making any 
change in the inner products of the rows ; that is, without 
altering the correlations. We would usually prefer, there- 
fore, to reverse the signs of a column like our Column III, 
so as to make its largest member positive. 
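
As a rough check on the two statements above, the following sketch forms the inner products of the rows of the loadings table and confirms that reversing the signs of Column III leaves them unchanged. The loadings are those printed in the table; the function names are illustrative only.

```python
# Common-factor loadings (I, II, III) of the four tests, signs replaced.
loadings = [
    [0.6005, -0.1815, -0.4545],
    [0.7881, -0.1651, 0.2272],
    [0.7881, -0.1651, 0.2272],
    [0.4879, 0.5119, 0.0],
]

def inner_product(u, v):
    """Sum of the products in pairs of two series of numbers."""
    return sum(a * b for a, b in zip(u, v))

def reproduced(rows):
    """Inner products of every pair of rows; the diagonal holds the communalities."""
    return [[inner_product(u, v) for v in rows] for u in rows]

# Reverse every sign in Column III only.
flipped = [[a, b, -c] for a, b, c in loadings]

original = reproduced(loadings)
after_flip = reproduced(flipped)
print(round(original[0][1], 4))
```

The off-diagonal entries agree with the correlations .4, .4, .2, .7, .3, .3 to about three decimals, which is as close as four-figure loadings allow.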

The amount which each factor contributes to the variance 
of the test is indicated by the square of its loading in that 
test. The sum of the squares of the three common-factor 
loadings gives the "communality" which we originally 
deduced from Figure 10 and inserted in the diagonal cells of 
our original correlation matrix. These facts can be better 
seen if we make a table of the squares of the above loadings: 

* By the "inner product" of two series of numbers is meant the 
sum of their products in pairs. Thus the inner product of the two 
sets: 

        a    b    c    d 
and     A    B    C    D 
is      aA + bB + cC + dD 


Variance contributed by Each Factor 

Test        I        II       III     Communality   Specific    Total
                                                    Variance
1         .3606    .0329    .2065       .6000        .4000        1
2         .6211    .0273    .0516       .7000        .3000        1
3         .6211    .0273    .0516       .7000        .3000        1
4         .2380    .2620      .         .5000        .5000        1

Total    1.8408    .3495    .3097      2.5000       1.5000        4


6. Comparison of the analysis with the diagram.—The 
reader has probably been turning from this calculation of 
the factor loadings back to the four-oval diagram with 
which we started (page 69), to detect any connexion ; and 
has been disappointed to find none. The fact is that the 
analysis to which the Thurstone method has led us is, 
except that it too has three common factors, a different 
analysis from that which the original diagram naturally 
invites. That diagram gave for the variance due to each 
factor the following : 


Variance contributed by Each Factor 

Test        I        II       III     Communality   Specific    Total
                                                    Variance
1          .4        .        .2         .6           .4          1
2          .4       .3         .         .7           .3          1
3          .4       .3         .         .7           .3          1
4           .       .3        .2         .5           .5          1

Totals    1.2       .9        .4        2.5          1.5          4

and the factor loadings are the positive square roots of 
these. 

Loadings of the Factors 

Test        I         II        III      Specifics
1         .6325       .       .4472       .6324
2         .6325     .5477       .         .5477
3         .6325     .5477       .         .5477
4           .       .5477     .4472       .7071


The only points in common between the two analyses are 
that they both have the same communalities (and therefore 
the same specific variances) and the same number of com- 
mon factors. The Thurstone analysis has two general 
factors (running through all four tests), while the diagram 
had none : and the Thurstone analysis has several negative 
loadings, while the diagram had none. We shall see later 
that Thurstone, after arriving at this first analysis, en- 
deavours to convert it into an analysis more like that of 
our diagram, with no negative loadings and no completely 
general factors. This is one of the most difficult yet 
essential parts of his method. 

7. Analysis into two common factors.—When we began 
our analysis of the matrix of correlations corresponding to 
Figure 10, we simply put the communalities suggested by 
that figure into the blank diagonal cells. That served to 
illustrate the fact that the Thurstone method of calculation 
will bring out as many factors as correspond to the com- 
munalities used, here three factors. But it disregarded 
(intentionally for the purpose of the above illustration) a 
cardinal point of Thurstone's theory that we must seek 
for the communalities which make the rank of the matrix a 
minimum, and therefore the number of common factors a 
minimum. We simply accepted the communalities sug- 
gested by the diagram. Let us now repair our omission 
and see if there is not a possible analysis of these tests into 
fewer than three common factors. There is no hope of 
reducing the rank to one, for the original correlations give 
two of the three tetrads different from zero, and we may 
(in an artificial example) assume that there are no experimental 
or other errors. But there is nothing in the experimental 
correlations to make it certain that rank 2 
cannot be attained. With only four tests (far too few, be 
it remembered, for an actual experiment) there is no minor 
of order three entirely composed of experimentally obtained 
correlations. It may then be the case that communalities 
can be found which reduce the rank to 2. Indeed, as we 
shall see presently, many sets of communalities will do so, 
of which one is shown here : 


            (.2667)   .4       .4       .2
             .4      (.7)      .7       .3
             .4       .7      (.7)      .3
             .2       .3       .3      (.15)

These communalities .2667 (that is, .26 recurring), .7, .7, and .15 
make every three-rowed minor exactly zero. For example, the minor 

            (.2667)   .4       .2
             .4      (.7)      .3
             .2       .3      (.15)

becomes by "pivotal condensation": 

             .0267    0
             0        0

and finally 0. 

It must, therefore, be possible to make a four-oval 
diagram, showing only two common factors, and indeed 
more than one such diagram can be found. One is shown 
in Figure 11. 

[Figure 11.] 

lation when approximate communalities are used, and it is 
better to delete them at each stage and make a new 
approximation. 


11. Illustrated on the example.—To make this clearer, the 
whole approximate process is here set out for our small 
example as far as the second residual matrix. The explanations 
printed alongside the calculation will make each stage clear. 
It is important to form the residual matrices exactly as 
instructed, as otherwise the check of the columns summing 
to zero will not work. In practice, certainly if a calculating 
machine were being used, several of the matrices here printed 
for clearness would be omitted; for example, with a machine 
one would go straight from A to C, while D and E would be 
made by actually altering C itself: 

A. Largest r of column inserted in diagonal cell.

            (.5883)   .4000    .4000    .2000    .5883
             .4000   (.7000)   .7000    .3000    .2852
             .4000    .7000   (.7000)   .3000    .2852
             .2000    .3000    .3000   (.3000)   .1480
             .5883    .2852    .2852    .1480   (.5883)
Sums        2.1766   2.3852   2.3852   1.2480   1.8950   = 10.0900
                                                         = 3.1765²
Loadings I   .6852    .7509    .7509    .3929    .5966   = 3.1765

B. First-factor matrix.

   .6852    (.4695)   .5145    .5145    .2692    .4088
   .7509     .5145   (.5639)   .5639    .2950    .4480
   .7509     .5145    .5639   (.5639)   .2950    .4480
   .3929     .2692    .2950    .2950   (.1544)   .2344
   .5966     .4088    .4480    .4480    .2344   (.3559)

C. First residual matrix, A - B.

            (.1188)  -.1145   -.1145   -.0692    .1795
            -.1145   (.1361)   .1361    .0050   -.1628
            -.1145    .1361   (.1361)   .0050   -.1628
            -.0692    .0050    .0050   (.1456)  -.0864
             .1795   -.1628   -.1628   -.0864   (.2324)
Columns
check to
zero         .0001   -.0001   -.0001    .0000   -.0001

D. Largest r of each column (regardless of sign) inserted in each
   diagonal cell.

            (.1795)  -.1145   -.1145   -.0692    .1795
            -.1145   (.1628)   .1361    .0050   -.1628
            -.1145    .1361   (.1628)   .0050   -.1628
            -.0692    .0050    .0050   (.0864)  -.0864
             .1795   -.1628   -.1628   -.0864   (.1795)
Sum dis-
regarding
signs        .6572    .5812    .5812    .2520    .7710

E. Signs of Tests 2, 3, and 4 changed to make largest column (.7710)
   all positive.

            (.1795)   .1145    .1145    .0692    .1795
             .1145   (.1628)   .1361    .0050    .1628
             .1145    .1361   (.1628)   .0050    .1628
             .0692    .0050    .0050   (.0864)   .0864
             .1795    .1628    .1628    .0864   (.1795)
Algebraic
Sum          .6572    .5812    .5812    .2520    .7710   = 2.8426
                                                         = 1.6860²
Loadings II  .3898    .3447    .3447    .1495    .4573   (With temporary
                                                         signs.)

F. Second-factor matrix, using temporary signs.

   .3898    (.1519)   .1344    .1344    .0583    .1783
   .3447     .1344   (.1188)   .1188    .0515    .1576
   .3447     .1344    .1188   (.1188)   .0515    .1576
   .1495     .0583    .0515    .0515   (.0224)   .0683
   .4573     .1783    .1576    .1576    .0683   (.2091)

G. Second residual matrix, E - F.

            (.0276)  -.0199   -.0199    .0109    .0012
            -.0199   (.0440)   .0173   -.0465    .0052
            -.0199    .0173   (.0440)  -.0465    .0052
             .0109   -.0465   -.0465   (.0640)   .0180
             .0012    .0052    .0052    .0180  (-.0296)
Columns
check to
zero        -.0001    .0001    .0001   -.0001    .0000


Notes.—It is fortuitous that all the entries in E are positive. 
Usually some will be negative. 
In the check for the residual matrices, a discrepancy from zero 
in the last figure is often to be expected, even of three or four units 
in a large matrix. 
Note the negative value occurring in a diagonal cell in G. 

Further stages would be carried on in the same way. 
But at each stage the residues will be examined to see if 
further analysis is worth while, by methods indicated later. 
Meanwhile let us assume in the present example that no 
more factors need be extracted. 

The matrix of loadings of common factors thus arrived 
at is, after we have replaced the proper signs in Loadings II, 
shown at the top of the next page. 
The communalities .6214, etc., are the sums of the 
squares of the two loadings. For comparison with the 

This gives exactly the same correlations. For example: 

$$r_{23} = \frac{12 + 2}{\sqrt{20 \times 20}} = \frac{14}{20} = .7$$

$$r_{34} = \frac{12}{\sqrt{20 \times 80}} = \frac{12}{40} = .3$$

It also gives the communalities .2667, .7, .7, .15. For 
example, in Test 1, variance to the amount of 12 out of 
45 is communal, and 12/45 = .2667 (.26 recurring). 

The insertion of these communalities, therefore, in the 
matrix of correlations ought to give a matrix which only 
two applications of Thurstone's calculation should completely 
exhaust. The reader is advised to carry out the 
calculation as an exercise. He will find for the first-factor 
loadings: 

             .5000    .8290    .8290    .3750

and if in the first residual matrix, following our rule, he 
changes temporarily the signs of Tests 2 and 3, the 
second-factor loadings will be: 

             .1291   -.1128   -.1128    .0968


The second residual matrix will be found to be exactly 
zero in each of its sixteen cells. The variance (square of 
the loading) contributed by each factor to each test is then 
in this analysis: 

Variance contributed by Each Factor 

Test        I        II      Communality   Specific    Total
                                           Variance
1         .2500    .0167       .2667        .7333        1
2         .6873    .0127       .7000        .3000        1
3         .6873    .0127       .7000        .3000        1
4         .1406    .0094       .1500        .8500        1

Totals   1.7652    .0515      1.8167       2.1833        4


If we now compare these analyses, we see that the three 
common factors of the previous analysis "took out," as 
the factorial worker says, a variance of 2.5 of the total 4, 
leaving 1.5 for the specifics. The present analysis leaves 
2.1833 for the specifics, which here form a larger part of 
the four tests. 

8. Alexander’s rotation—We saw in Section 6 that the 
Thurstone method there led to an analysis which was 
different from the analysis corresponding to the diagram 
with which we began. That is also the case with the 
present analysis into two common factors—the very fact 
that it gives the second factor two negative loadings shows 
this, for the diagram (Figure 11) corresponds to positive 
loadings only. We said, too, in Section 6 that a difficult 
part of Thurstone’s method was the conversion of the 
loadings into new and equivalent loadings which are all 
positive. This will form the subject of a later and more 
technical chapter ; but a simple illustration of one method 
of conversion (or “ rotation " as it is called, for a reason 
which will become clear later) can be given from our present 
example. It is a method which can be used only if we have 
reason to think that one of our tests contains only one 
common factor (Alexander, 1935, 144). Let us suppose in 
our present case that from other sources we know this fact 
about Test 1. The centroid analysis has given us the 
loadings shown in the first two columns of this table : 


          Unrotated                      Rotated             Rotated
Test      Loadings       Communality     Loadings            Loadings
          I       II                     I*       II*        I**      II**
1       .5000    .1291     .2667       .5164       .        .4781    .1952
2       .8290   -.1128     .7000       .7746     .3162      .8367      .
3       .8290   -.1128     .7000       .7746     .3162      .8367      .
4       .3750    .0968     .1500       .3873       .        .3586    .1464


The communalities are also shown; they are the sums of 
the squares of the loadings. If now we know or decide to 
assume that Test 1 has really only one common factor, and 
if we want to preserve the communalities shown, then the 
loading of factor I* in Test 1 must be the square root of 
.2667, namely .5164. 

The loadings of factor I* in the other three tests can 
now be found from the fact that they must give the correlations 
of those tests with Test 1, since Test 1 has no 
second factor to contribute. The loadings shown in 
column I* are found in this way: for example, .7746 is 
the quotient of .5164 divided into $r_{12}$ (.4), and .3873 is 
similarly $r_{14}$ (.2) divided by .5164. 

The contributions of factor I* to the communalities are 
obtained by squaring these loadings. In Test 1, we 
already know that factor I* exhausts the communality, for 
that is how we found its loading. We discover that in 
Test 4, factor I* likewise exhausts the communality, for 
the square of .3873 is .1500. The other two tests, however, 
have each an amount of communality remaining equal to 
.1000 (i.e. .7000 - .7746²). The square root of .1000, 
therefore (.3162), must be the loading of factor II* in 
Tests 2 and 3. The double column of loadings ought now 
to give all the correlations of the original correlation 
matrix, and we find that it does so. Thus, e.g.: 

$r_{23} = .7746 \times .7746 + .3162 \times .3162 = .7000$
and $r_{24} = .7746 \times .3873 = .3000$


Moreover, the analysis into factors I* and II* corresponds 
exactly to Figure 11. For example, the loading of 
factor II* in Test 2 in that diagram is the square root of 
2/20 (.3162); and the loading of factor I* in Test 4 is the 
square root of 12/80 (.3873). 
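
The rotation just described can be sketched as follows, assuming two common factors, one test known to contain only the first of them, and all second-factor loadings taken as positive (which suffices for this example but is an assumption, not a general rule). The function name `rotate_on_pure_test` is hypothetical.

```python
import math

# Correlations of the four tests and the rank-2 communalities (.26 recurring = 4/15).
R = [
    [0.0, 0.4, 0.4, 0.2],
    [0.4, 0.0, 0.7, 0.3],
    [0.4, 0.7, 0.0, 0.3],
    [0.2, 0.3, 0.3, 0.0],
]
H2 = [4 / 15, 0.7, 0.7, 0.15]

def rotate_on_pure_test(r, h2, pure):
    """Loadings on two rotated factors when test `pure` has no second factor."""
    n = len(h2)
    pivot = math.sqrt(h2[pure])  # first factor exhausts this test's communality
    first = [pivot if j == pure else r[pure][j] / pivot for j in range(n)]
    # Whatever communality the first factor leaves over goes to the second factor.
    second = [0.0 if j == pure else math.sqrt(max(h2[j] - first[j] ** 2, 0.0))
              for j in range(n)]
    return first, second

first_star, second_star = rotate_on_pure_test(R, H2, pure=0)    # Test 1 pure
first_2star, second_2star = rotate_on_pure_test(R, H2, pure=1)  # Test 2 pure
print([round(x, 4) for x in first_star], [round(x, 4) for x in second_star])
```

With Test 1 taken as pure the result corresponds to columns I* and II*; with Test 2 taken as pure it corresponds to columns I** and II**.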

If, however, the experimenter had reasons for thinking that 
Test 2 (not Test 1) was free from the second common factor, 
his "rotation" of the loadings would have given a different 
result, shown in the table on page 79 in columns I** and II**. 
This set of loadings also gives the correct communalities and 
the experimental correlations, but does not correspond 
to Figure 11. A diagram can, however, be constructed to 
agree with it (Figure 12) and the reader is advised to check 
the agreement by calculating from the diagram the loadings 
of each factor, the communalities of each test, and the 
correlations. 

[Figure 12.] 

We have had, in Figures 10, 11, and 12, three different 
analyses of the same matrix of correlations. If with 
Thurstone we decide that analyses must always use the 
minimal number of common factors, we will reject Figure 10. 
Between Figures 11 and 12, however, this principle makes 
no choice. Much of the later and more technical part of 
Thurstone’s method is taken up with his endeavours to 
lay down conditions which will make the analysis unique. 

9. Unique communalities.—The first requirement for a 
unique analysis is that the set of communalities which gives 
the lowest rank should be unique, and this is not the case 
with a battery of only four tests and minimal rank 2, like 
our example. There are many different sets of com- 
munalities, all of which reduce the matrix of correlations 
of our four tests to rank 2. If, for example, we fix the 
first communality arbitrarily, say at .5, we can condense 
the determinant to one of order 3 by using .5 as a pivot 
(as on page 65) except that the diagonal of the smaller 
matrix will be blank: 

            (.5)      .4       .4       .2
             .4        .       .7       .3
             .4       .7        .       .3
             .2       .3       .3        .

                       .       .19      .07
                      .19       .       .07
                      .07      .07       .

We can then fill the diagonal of the smaller matrix with 
numbers which will make each of its tetrads zero, namely: 

                      .19      .19      .0258

and then, working back to the original matrix, find the 
communalities: 

             .5       .7       .7       .1316


which make its rank exactly 2. We can similarly insert 
different numbers for the first communality and calculate 
different sets of communalities, any one set of which will 
reduce the rank to 2. In this way we can go from 1.0 
down to .22951 for the first communality without obtaining 
inadmissible magnitudes for the others. Some sets 
are given in the following table *: 

     1         2        3        4          Sum

   1.0        .7       .7      .12963     2.52963
    .5        .7       .7      .13158     2.03158
    .3        .7       .7      .14        1.84
    .2667     .7       .7      .15        1.8167
    .25       .7       .7      .1667      1.8167
    .24       .7       .7      .20        1.84
    .22951    .7       .7     1.0         2.62951
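
The rows of this table can be regenerated by following the condensation described above: fix the first communality, condense on it as pivot, fill the blank diagonal of the smaller matrix so that its tetrads vanish, and work back. The sketch below is written for this four-test example only; the function name is illustrative, and the formula for the second and third communalities breaks down if a condensed correlation happens to be exactly zero (as it is when the first communality is 4/15), so such values are avoided here.

```python
def rank_two_communalities(h1):
    """Communalities of Tests 2-4 that give rank 2 once Test 1's is fixed at h1."""
    r12, r13, r14, r23, r24, r34 = 0.4, 0.4, 0.2, 0.7, 0.3, 0.3
    # Off-diagonal cells of the condensed matrix (pivot h1).
    c23 = h1 * r23 - r12 * r13
    c24 = h1 * r24 - r12 * r14
    c34 = h1 * r34 - r13 * r14
    # Diagonal cells that make every tetrad of the condensed matrix zero.
    d2 = c23 * c24 / c34
    d3 = c23 * c34 / c24
    d4 = c24 * c34 / c23
    # Work back: each condensed diagonal cell is h1 * h_j - r_1j squared.
    return [h1, (d2 + r12 ** 2) / h1, (d3 + r13 ** 2) / h1, (d4 + r14 ** 2) / h1]

for h1 in (1.0, 0.5, 0.3, 0.24):
    row = rank_two_communalities(h1)
    print([round(v, 5) for v in row], round(sum(row), 5))
```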


If, however, we search for and find a fifth test to add to 
the four, which will still permit the rank to be reduced to 
2, this fifth test will fix the communalities at some point 
or other within the above range. Suppose that this test 
gave the correlations shown in the last row and column : 


          1        2        3        4        5
1         .       .4       .4       .2      .5883
2        .4        .       .7       .3      .2852
3        .4       .7        .       .3      .2852
4        .2       .3       .3        .      .1480
5       .5883    .2852    .2852    .1480      .


If we now try to find communalities to reduce this 
matrix to rank 2 (as can be done), we find only the one 
set: 

             .7       .7       .7       .13030   .5


The reader can try this by assigning an arbitrary value for 
the first one,† and then condensing the matrix on the lines 
employed above, when he will always find some obstacle 
in the way unless he chooses .7. Try, for example, .5 for 
the first communality: 

            (.5)      .4       .4       .2       .5883
             .4        .       .7       .3       .2852
             .4       .7        .       .3       .2852
             .2       .3       .3        .       .1480
             .5883    .2852    .2852    .1480      .

                      (x)      .19      .07     -.09272
                      .19       .       .07     -.09272
                      .07      .07       .      -.04366
                    -.09272  -.09272  -.04366      .

* The circumstance that the communalities of Tests 2 and 3 
remain fixed and alike is due to these tests being identical except for 
their specific. This lightens the arithmetic, but would not occur 
in practice. 

† Alternatively, the communalities (which are now unique) can 
be found by equating to zero those three-rowed minors which have 
only one element in common with the diagonal. In this connexion 
see Ledermann, 1937a. 


Now, if the upper matrix is to be of rank 2, the 
second condensation must give only zeros (see footnote, 
page 65). But if we fix our attention on different tetrads 
in the lower matrix which contain the pivot x, we see that 
they give, if they have to be zero, incompatible values for 
x. Thus from one tetrad we get x = .19, from another 
x = .14866. With .5 as first communality, rank 2 
cannot be attained. With five tests (or more), if rank 2 
can be attained at all, it can be by only one unique set of 
communalities. Just as it took three tests to enable the 
saturations with Spearman's g to be calculated, so it takes 
five tests to enable communalities due to two common 
factors to be calculated. For larger numbers of common 
factors, the number of tests required to make the set of 
communalities unique is shown in the following table 
(Vectors, 77). The lower numbers * are given by the 
formula: 

$$\frac{(2r + 1) + \sqrt{8r + 1}}{2}$$

r Factors    1   2   3   4   5   6   7   8   9  10  11  12
n Tests      3   5   6   8   9  10  12  13  14  15  17  18


* With six tests the communalities which reduce to rank 3 are 
not necessarily unique, for there are, or there may be, two sets of 
them. See Wilson and Worcester, 1939. 
I think the ambiguity, which is not practically important, only 
occurs when n is exactly equal to the formula, e.g. when r = 3, 6, 
10, etc. 

            Approximate Method                True Values
Test        I         II      Communality     Communality
1         .6852     .3898       .6214           .7000
2         .7509    -.3447       .6827           .7000
3         .7509    -.3447       .6827           .7000
4         .3929    -.1495       .1767           .1303
5         .5966     .4573       .5651           .5000

                               2.7286          2.7303


approximate communalities thus obtained there are shown 
the true values, which in this artificial case are known to 
us (see Section 9). This is for instructional purposes 
only; the comparison is not intended as any criticism of 
Thurstone's method of approximation. As has been 
explained, this method is used only on large batteries, and 
it is a very severe test indeed to employ it on a battery of 
only five tests. 
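
The whole approximate process of Section 11 (matrices A to E and the two sets of loadings) can be sketched in a few lines. The sketch assumes the steps exactly as set out there, and it takes the temporary sign changes (Tests 2, 3, and 4) as given rather than trying to discover them; all function names are illustrative.

```python
import math

# Off-diagonal correlations of the five-test example (diagonal left at zero).
R = [
    [0.0, 0.4, 0.4, 0.2, 0.5883],
    [0.4, 0.0, 0.7, 0.3, 0.2852],
    [0.4, 0.7, 0.0, 0.3, 0.2852],
    [0.2, 0.3, 0.3, 0.0, 0.1480],
    [0.5883, 0.2852, 0.2852, 0.1480, 0.0],
]

def with_diagonal(m, diag):
    n = len(m)
    return [[diag[i] if i == j else m[i][j] for j in range(n)] for i in range(n)]

def largest_in_column(m):
    """Largest off-diagonal entry of each column, regardless of sign."""
    n = len(m)
    return [max(abs(m[i][j]) for i in range(n) if i != j) for j in range(n)]

def centroid_loadings(m):
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    root = math.sqrt(sum(sums))
    return [s / root for s in sums]

def residual(m, a):
    n = len(m)
    return [[m[i][j] - a[i] * a[j] for j in range(n)] for i in range(n)]

def reflect(m, signs):
    """Change the signs of the chosen tests in both rows and columns."""
    n = len(m)
    return [[signs[i] * signs[j] * m[i][j] for j in range(n)] for i in range(n)]

def approximate_two_factors(r, signs):
    a = with_diagonal(r, largest_in_column(r))   # matrix A
    first = centroid_loadings(a)                 # Loadings I
    c = residual(a, first)                       # matrix C = A - B
    d = with_diagonal(c, largest_in_column(c))   # matrix D
    e = reflect(d, signs)                        # matrix E
    # Loadings II, with the proper signs replaced.
    second = [s * b for s, b in zip(signs, centroid_loadings(e))]
    return first, second

first, second = approximate_two_factors(R, [1, -1, -1, -1, 1])
communalities = [a * a + b * b for a, b in zip(first, second)]
print([round(h, 4) for h in communalities])
```

The communalities so obtained agree with the "Approximate Method" column to within a unit or two in the fourth decimal place, the difference being due to rounding in the hand calculation.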

12. Iteration of the process to improve the communalities.— 
We might now go back and begin our whole calculation 
again, using the communalities .6214, etc., arrived at by 
the first approximation. This does not seem often to be 
done in practice, most workers being content with the 
approximation first arrived at. If we repeat the calculation 
again and again with our present example, on each 
occasion using as communalities the sum of the squares of 
the loadings given by the preceding calculation, we get the 
following sets of closer and closer approximation to the 
true communalities*: 

                              h_1^2    h_2^2    h_3^2    h_4^2    h_5^2
First trial communalities     .5883    .7000    .7000    .3000    .5883
Next approximation            .6214    .6827    .6827    .1767    .5651
Next approximation            .6381    .6970    .6970    .1477    .5392
Next approximation            .6535    .7043    .7043    .1397    .5253
True values                   .7000    .7000    .7000    .1303    .5000

* In these repetitions we do not, as in the case of the first guess, 
alter the diagonal cells in each matrix of residues: we retain the 
diagonal residues without change. 

The example has served to show how to work the 
iterative method of approximating to the communalities. 
Being an artificial example, and not overlaid with sampling 
error, it has had the advantage of allowing us to compare 
the approximations with the true values. But it must be 
remembered that a real experimental matrix is not likely 
to have an exact low rank to which approximation can 
converge as here. In that case the approximations will 
presumably give an indication of the low rank which the 
matrix nearly has, which it might be made to have by small 
adjustments in its elements. 

It should be pointed out that iteration of each factor 
extraction separately will not give the same result. By 
iteration of the factors one by one we mean that after the 
loadings of the first factor are obtained they are squared 
and put into the diagonal cells as new communalities, and 
this is repeated again and again until the communalities 
remain unchanged. When this point is reached, the orig- 
inal matrix of correlations has been reduced as nearly to 
rank one as is possible. 

If the residues, after removal of the first factor, are then 
(after sign-changing) treated in the same way, they in 
turn will be reduced as nearly as possible to rank one. 
And so with successive residues, each matrix of residues 
being in succession reduced as nearly as possible to rank 
one by iteration of the one summation only. This process, 
although much easier than reiterating the whole process, 
and to that extent excusable, will not give the lowest possible 
rank for the whole. Consider, for example, the 
correlations of the five tests used above on page 82. When 
communalities are reiterated with the first factor only, 
they settle down rapidly (the reader should check this) to: 

             .4571    .5421    .5421    .1261    .2729

When the residues then left are taken, and a factor taken out 
and iterated, the communalities settle down to: 

             .1677    .1003    .1003    .0113    .1680

The sum of these first-factor and second-factor sets is the 
set: 

             .6248    .6424    .6424    .1374    .4409

These, however, if inserted in the diagonal cells of the 
original matrix, do not reduce it exactly to rank two, as 
can be done by the true communalities: 

             .7000    .7000    .7000    .1303    .5000


Iteration over two factors, as shown in the table on page 88, 
produces with four repetitions the approximations: 

             .6535    .7043    .7043    .1397    .5253

and (since in this artificial example rank two can be exactly 
reached) would ultimately converge to the above true 
values, though at the expense of much labour, for the 
convergence is slow. The iteration of each factor separately, 
however, would never converge to the true values. The 
above values (.6248, etc.) are final, and yet do not give 
rank two. 
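
The first stage of the factor-by-factor iteration, which the reader is invited to check, can be sketched as below. It assumes that each repetition squares the first-factor centroid loadings and re-inserts them in the diagonal cells, starting from the largest coefficient of each column; the number of rounds (200) is an arbitrary choice large enough for the values to settle, and the function names are illustrative.

```python
import math

R = [
    [0.0, 0.4, 0.4, 0.2, 0.5883],
    [0.4, 0.0, 0.7, 0.3, 0.2852],
    [0.4, 0.7, 0.0, 0.3, 0.2852],
    [0.2, 0.3, 0.3, 0.0, 0.1480],
    [0.5883, 0.2852, 0.2852, 0.1480, 0.0],
]

def first_factor_communalities(r, h2):
    """Squares of the first centroid loadings when h2 fills the diagonal."""
    n = len(r)
    sums = [h2[j] + sum(r[i][j] for i in range(n) if i != j) for j in range(n)]
    total = sum(sums)
    # Loading = column total / sqrt(grand total), so its square is sum^2 / total.
    return [s * s / total for s in sums]

def iterate_first_factor(r, start, rounds=200):
    h2 = list(start)
    for _ in range(rounds):
        h2 = first_factor_communalities(r, h2)
    return h2

settled = iterate_first_factor(R, [0.5883, 0.7, 0.7, 0.3, 0.5883])
print([round(h, 4) for h in settled])
```

The settled values should agree with the first set quoted above to about three decimal places; they remain well below the true communalities, which illustrates why iterating on one factor alone cannot reach them.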

13. Other methods of assessing the communalities.—The 
labour of finding the minimum communalities by iteration 
is so great that methods of improving the first guess are 
desirable. Medland (Pmka. 1947, 12, 101-10) has tried 
nine such methods on a correlation matrix with 63 variables. 
A method entitled Centroid No. 1 method seemed 
to be best. A sub-group is chosen of from three to five 
tests which correlate most highly with the test whose 
communality is wanted. The highest correlation t in each 
column of the sub-group is inserted in the diagonal cell, 
and the columns summed. The grand total is also found. 
Then the estimate of $h_1^2$ is 

$$\frac{(\Sigma r_1 + t_1)^2}{\Sigma\Sigma r + \Sigma t}$$

where the numerator is the square of the column total, 
and the denominator is the grand total. Thus if the correlations 
of the sub-group were 

            (.72)     .72      .63      .24
             .72     (.72)     .47      .59
             .63      .47     (.63)     .41
             .24      .59      .41     (.59)

            2.31     2.50     2.14     1.83     = 8.78

the estimate of $h_1^2$ would be 

$$\frac{2.31^2}{8.78} = .608$$

Clearly the same sub-group will usually serve for more than 
one of its members. Thus from the above example $h_2^2$ 
can be estimated to be .712. 
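
The arithmetic of this estimate is easily sketched. The matrix is the sub-group given above, with the highest correlation of each column already inserted in its diagonal cell; the function name is illustrative and is not Medland's.

```python
# Sub-group correlations, highest r of each column already in the diagonal.
subgroup = [
    [0.72, 0.72, 0.63, 0.24],
    [0.72, 0.72, 0.47, 0.59],
    [0.63, 0.47, 0.63, 0.41],
    [0.24, 0.59, 0.41, 0.59],
]

def centroid_no1_estimates(m):
    """Square of each column total divided by the grand total."""
    n = len(m)
    col = [sum(m[i][j] for i in range(n)) for j in range(n)]
    total = sum(col)
    return [c * c / total for c in col]

estimates = centroid_no1_estimates(subgroup)
print([round(h, 3) for h in estimates])
```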

A graphical method, for which the reader is referred to 
Medland's article, was about equally accurate but more 
laborious. Rosner (Pmka. 1948, 13, 181-4) gives an algebraic 
solution for the communalities depending upon the 
Cayley-Hamilton theorem that any square matrix satisfies 
its own characteristic equation, but adds that the method 
"is not at all suited for practical purposes. The computational 
labour is prohibitive." It is, however, interesting 
theoretically and may suggest new advances. 


CHAPTER VI 
THE GEOMETRICAL PICTURE 


1. The scatter-diagram of two tests.—A well-known way 
of representing correlation, and that used by Sir Francis 
Galton who devised correlation coefficients, is by a 
scatter-diagram. The scores in two tests are used as rectangular 
abscissæ and ordinates, and each person represented by a 
dot. Thus, if a person makes a score of X = 72 in a Test 1 
and of Y = 59 in a Test 2, he is represented by the point P. 
The two tests are represented by the rectangular axes. 
If a large number of persons take the two tests, their points 
form the "scatter-diagram," looking like a lot of shots at a 
target. The dots are most densely crowded together near 
a point whose ordinates are the average scores in the two 
tests. If there is no correlation between the two tests, 
and suitable units are used, the dots will thin out equally 
in all directions, forming a circular-shaped group. If, on 
the other hand, there is correlation, the group of dots will 
be elliptical in appearance, with an axis slanting-wise 
inclined to the test lines; and more and more elliptical 
the closer the resemblance of the scores, the higher, that 
is, the correlation. If we have first standardized the scores, 
the test lines will pass through the centre of the group, the 
average, and the axis of the ellipse will be equally inclined 
to both tests. In Figure 14 it is indicated how the elliptical 
group of dots narrows in the one direction, and 
lengthens in the other, with increasing correlation. The 
circle corresponds to zero correlation, the fat ellipse to 
r = .5, the long thin one to r = .9. In perfect correlation 
all the dots would be on a line. In negative correlation 
the ellipse would be slanting the other way. These 
ellipses must not be looked upon as bounding the group of 
dots, which thins out to an indefinite distance. They are 
like contours of a hill, being, in fact, "contours" of the 
density of the dots. 

[Figure 13: axes Test 1 and Test 2.] 

[Figure 14: axes Test 1 and Test 2.] 

2. Three iests.—When we have three tests we need 
three rectangular axes, like the three lines which meet in 
the corner of a room. ‘A person's three scores, measured 
along these lines, define a point in solid space, a point in 
the room. The points thus representing a large number 
of persons will form a swarm in the room, congregated 
most thickly round the man who is average in all three 
tests, like a swarm of bees round the queen. If there is no 
correlation between any of the tests and suitable units are 
used, the swarm will be globular, but if there is correlation 
it will lengthen into an ellipsoidal shape like a Rugby 
football or a Zeppelin, though its waistline need not be 
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dot. Thus, if a person makes a score of X = 72 in a Test 1 
and of Y — 59 in a Test 2, he is represented by the point P. 
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If a large number of persons take the two tests, their points 
form the “scatter-diagram,” looking like a lot of shots at a 
target. The dots are most densely crowded together near 
a point whose ordinates are the average scores in the two 
tests. If there is no correlation between the two tests, 
and suitable units are used, the dots will thin out equally 
in all directions, forming a circular-shaped group. If, on 
the other hand, there is correlation, the group of dots will 
be elliptical in appearance, with an axis slanting-wise 
inclined to the test lines; and more and more elliptical— 
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the closer the resemblance of the scores, the higher, that 
is, the correlation. If we have first standardized the scores, 
the test lines will pass through the centre of the group, the 
average, and the axis of the ellipse will be equally inclined 
to both tests. In Figure 14 it is indicated how the ellip- 
tical group of dots narrows in the one direction, and 
lengthens in the other, with increasing correlation. The 
circle corresponds to zero correlation, the fat ellipse to
r = .5, the long thin one to r = .9. In perfect correlation
all the dots would be on a line. In negative correlation
the ellipse would be slanting the other way. These
ellipses must not be looked upon as bounding the group of
dots, which thins out to an indefinite distance. They are
like contours of a hill, being, in fact, “contours” of the
density of the dots.

Figure 14. (Axes: Test 1 and Test 2.)
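The narrowing of the contour with increasing correlation can be put into figures. The sketch below assumes standardized scores, and relies on a standard property that the text does not state: the equal-density ellipse then has semi-axes proportional to the square roots of 1 + r and 1 - r. The function name `contour_axes` is our own.

```python
import math

def contour_axes(r):
    """Relative semi-axes (long, short) of the equal-density ellipse for
    standardized scores with correlation r. Assumption: the axes are
    proportional to sqrt(1 + |r|) and sqrt(1 - |r|)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return math.sqrt(1 + abs(r)), math.sqrt(1 - abs(r))

for r in (0.0, 0.5, 0.9):
    long_axis, short_axis = contour_axes(r)
    print(f"r = {r:.1f}: short/long = {short_axis / long_axis:.3f}")
```

A ratio of 1 is the circle of zero correlation; the ratio falls towards 0 as the ellipse thins out towards a line.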

2. Three tests.—When we have three tests we need
three rectangular axes, like the three lines which meet in
the corner of a room. A person's three scores, measured
along these lines, define a point in solid space, a point in
the room. The points thus representing a large number 
of persons will form a swarm in the room, congregated 
most thickly round the man who is average in all three 
tests, like a swarm of bees round the queen. If there is no 
correlation between any of the tests and suitable units are 
used, the swarm will be globular, but if there is correlation 
it will lengthen into an ellipsoidal shape like a Rugby 
football or a Zeppelin, though its waistline need not be 
circular. In place of the ellipses of the two-dimensional
figure, we now have ellipsoidal shells of equal density of the
dots representing persons. One such is shown in Figure
15, which the reader can imagine as being the room in
which he is seated, the test lines, in their positive halves,
being represented by the three edges of floor and walls
which meet in a corner, where the point representing the
average man is placed. The ellipsoidal swarm is then
partly in the room, partly outside and below it. The part
of the swarm in the room (in the positive octant, that is) is
composed of persons scoring above the average in all three
tests. The end of the major axis of the ellipsoid, that is,
the longest line that can be drawn in it, is shown project-
ing. If the tests have all been standardized, this major
axis will be equally inclined to the three test lines. The shadow of
the ellipsoid, projected at right angles on to a wall or the
floor of the room, will be a correlational ellipse due to the
two tests edging that wall, or edging the floor. These
three silhouettes will in general be different, depending on
the adiposity, as it were, of the ellipsoid.

Figure 15. (Axes labelled Test 1, Test 2.)

When we have more than three tests we cannot make or 
easily imagine a similar model, for we know in real life
only space of three dimensions. But mathematically we
still can conceive of as many rectangular axes as there are
tests, in a “space” of more dimensions, of as many
dimensions, indeed, as the number of tests. And we still
speak of the “ellipsoidal” shape of the swarm of persons.

3. The four quadrants.—Let us now return to the case
of two tests. If the persons tested are numerous it will,
with most tests, be found that the numbers in the two
quadrants marked a are approximately equal (the axes
being drawn, it is understood, through the average score
of each test) and, similarly, the numbers in the two quad-
rants marked b in the figure.

        b | a
       ---+---
        a | b

A portion a of the crowd of persons, that is, get scores 
above the average in each test, and an equal portion a are 
below the average in each. These people add to the 
correlation between the tests, whereas the others, in the 
b quadrants, are all good in one but bad in the other test 
and detract from the correlation. It can then be shown 


ERRATA (FIFTH EDITION)


The last complete sentence on page 94 and the last
sentence of section 5 on page 97 are incorrect and should
be deleted. The major axis is not equally inclined, in
general, to the orthogonal test lines.


then

$$ r = \cos\theta = \cos\left(\frac{1000}{3000} \times 180^\circ\right) = \cos 60^\circ = 0.5 $$


Actual correlation tables will, of course, not show such
complete equality in the opposite quadrants, and, more-
over, the reader must beware of applying this formula
unless the dividing lines are drawn through the means.
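The arithmetic of this formula can be checked with a few lines of code. The passage that states the formula is missing from this copy, so the sketch assumes the usual cosine rule: the angle is 180° multiplied by the proportion of persons falling in the b quadrants. The function name and the counts 2000 and 1000 are our own, chosen only to reproduce the ratio 1000/3000 of the worked figure.

```python
import math

def quadrant_correlation(like, unlike):
    """Assumed rule: r = cos(180 degrees * unlike / (like + unlike)).
    `like` counts persons in the two a quadrants, `unlike` those in the
    two b quadrants; the axes must pass through the means."""
    total = like + unlike
    if total <= 0:
        raise ValueError("need at least one person")
    return math.cos(math.pi * unlike / total)

print(round(quadrant_correlation(like=2000, unlike=1000), 3))
```

When the b quadrants hold half the crowd the angle is 90° and the correlation vanishes; when they are empty the correlation is perfect.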

4. Making the crowd circular.—We are next going to
make a change in our model by rotating the two test
vectors, hitherto at right angles, towards one another until
the angle between them is the above angle θ, whose cosine
is the correlation coefficient. A person’s point P will still
be located at the point where the two perpendiculars from
his scores meet. The rotation of the test lines towards one
another, about the point where they cross, will, however,
move the dots representing persons, and move them in such
a way that the crowd becomes circular.
The presence of correlation is not now shown by the con-
figuration of the crowd, but by the angle between the test
lines. The cosine of this angle is the correlation coefficient.
If we guide the eye by drawing a dotted line at right
angles to each test line, we see that our former quadrants
a and b are now represented by sectors of the circular
crowd. Perpendiculars from any point in the white sectors
a on to the test lines both fall on the same side of the
average: all persons situated in these sectors are either
above the average in both tests (like P) or below in both.
Anyone, on the other hand, whose point is in the shaded
sector b is above the average in one of the tests and below
in the other. Those in a add to the correlation, those in b
diminish it. If correlation is perfect, the two test lines 
must be brought together until they coincide: and then 
the dotted lines will also coincide and the sector b will 
disappear. If, on the other hand, the correlation is low, 
the test lines will have to be farther apart, and the sector b 
will increase, until, when correlation is zero, the test lines 
are at right angles and the sectors a and b are equal 
and balance one another, the pros equal to the cons. 
For negative correlation the angle θ between the test
lines becomes obtuse, and the sectors b larger than the 
sectors a. 
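That a circular crowd, read off on test lines set at the angle θ, reproduces the correlation cos θ can be verified numerically. The following is a minimal sketch under our own assumptions: the crowd is idealized as points spaced evenly round a unit circle, and the name `projection_correlation` is hypothetical.

```python
import math

def projection_correlation(theta_deg, n_points=360):
    """Product-moment correlation between the scores read off two test
    lines theta_deg apart, for a circular crowd of person-points."""
    theta = math.radians(theta_deg)
    angles = [2 * math.pi * k / n_points for k in range(n_points)]
    # A score is the foot of the perpendicular from the point
    # (cos a, sin a) on to a test line: the dot product of the point
    # with a unit vector along that line.
    x = [math.cos(a) for a in angles]            # test line 1 at 0 degrees
    y = [math.cos(a - theta) for a in angles]    # test line 2 at theta
    mx, my = sum(x) / n_points, sum(y) / n_points
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)

for theta in (0, 60, 90, 120):
    print(theta, round(projection_correlation(theta), 3))
```

The acute angle gives a positive correlation, the right angle none, and the obtuse angle a negative one, as described above.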

5. Ellipsoid into sphere.—With three tests we saw that 
the solid “ scatter-diagram,” made with the test lines at 
right angles to one another, was ellipsoidal in form. Just 
as we converted the elliptical two-dimensional scatter- 
diagram into a circular crowd of dots by bringing the test 
lines closer together, until the cosine of the angle between 
them equalled the correlation coefficient, so with the ellip- 
soidal swarm of dots when we have three tests. If we take 
hold of the three test lines and swivel them nearer to each 
other, until the angle between each pair represents their cor- 
relation coefficient by its cosine, we then find that the ellip- 
soid has become a sphere. Moreover, we then find that the 
long major axis of the ellipsoid, which (with standardized 
tests) was equally inclined to the three rectangular test 
lines, is not now equally inclined to them, now that they 
have been brought into their new positions—unless, indeed, 
all three correlations are exactly equal. 

6. A wire model.—Let us suppose we want to make a 
wire model of this arrangement of three test lines, supposing 
that we have calculated by the usual product-moment 
formula the three correlation coefficients. Choosing any 
two of the tests, we find from a table of cosines what angle 
has a cosine equal to their correlation coefficient, and we 
lay two straight wires on the table crossing one another at
this angle, like an X. Imagine them soldered together at
the point where they cross, which represents the man 
average in each test. 

Now consider the third test, and look up the angles 
whose cosines equal its correlation coefficients with Tests 
1 and 2. The wire for this third test must be so placed as
to make these angles with the first two wires—and we find 
that it will not lie flat on the table but sticks up at an angle, 
and its negative half has to go through the table and stick 
out below it. If we solder the three wires together where 
they cross (at the point representing the man who gets 
the average score in each of the three tests) and pick them 
up, they form a double tripod. 
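The construction of the wire model can be imitated in co-ordinates. In this sketch the table is the plane z = 0, the first wire lies along the x axis, and the three correlations (.5, .4 and .3) are hypothetical values of our own; the function name is likewise an assumption.

```python
import math

def wire_directions(r12, r13, r23):
    """Unit vectors for three test wires whose pairwise cosines equal the
    given correlations. Wires 1 and 2 lie flat on the table (z = 0);
    requires |r12| < 1."""
    s12 = math.sqrt(1 - r12 ** 2)
    w1 = (1.0, 0.0, 0.0)
    w2 = (r12, s12, 0.0)
    x3 = r13                              # cosine with wire 1
    y3 = (r23 - r12 * r13) / s12          # fixed by the cosine with wire 2
    z_sq = 1 - x3 ** 2 - y3 ** 2          # what is left for the height
    if z_sq < 0:
        raise ValueError("these three correlations cannot occur together")
    return w1, w2, (x3, y3, math.sqrt(z_sq))

w1, w2, w3 = wire_directions(0.5, 0.4, 0.3)
print("angles on the table:", round(math.degrees(math.acos(0.5)), 1))
print("height of wire 3 above the table:", round(w3[2], 3))
```

A positive height for the third wire is the arithmetical counterpart of its sticking up from the table.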

7. Two kinds of space.—It will be seen that we have
described two geometrical ways of representing correlation 
using two different spaces. In the one kind of space, the 
test lines are at right angles to one another, or orthogonal, 
and the presence of correlation is shown by the fact that 
the swarm of dots representing persons is not spherical but 
ellipsoidal. 

In the other kind of space, the crowd of dots representing 
persons is spherical and the presence of correlation is 
shown by the test lines not being orthogonal but at angles 
with one another whose cosines equal the correlation 
coefficients. 

In both kinds of space, a person's scores in the tests are 
found by dropping perpendiculars from his point on to the 
test lines. The distances of the feet of these perpendiculars 
from the origin—that is, from the point where the test lines 
cross—are his scores in the tests. 

If the test lines in this second kind of space are swivelled 
back into orthogonality, the person-points will move, will 
cease to be spherical in contour, and become ellipsoidal. 
All this is true, not only for three-dimensional space, when
we have only three tests, but for multi-dimensional space
needed to represent many tests and their inter-correlations.
The algebra is exactly the same for any number of dimen-
sions, and we continue, in the larger spaces, to use by
analogy the terms we are accustomed to in real space, such
as sphere, ellipsoid, etc.

8. A still larger space.—Another way of arriving at the
second of the above two kinds of space—the spherical one, 
in which the cosines equal the correlation coefficients—is 
to begin with a much larger space, of as many dimensions 
as there are persons, who are therein represented by 
orthogonal axes. If along each person's axis we set off 
the score he gets in a given test, say Test 1, these abscissae
will define a point in the space representing that test. In
the same way each test can be represented by a point. It
is a scatter-diagram with the usual rôles of tests and persons
exchanged.

These test points will usually be much less numerous than 
the persons, and they define a sub-space of dimensions 
equal to the number of tests. This sub-space, if the test 
scores have been normalized,* is the same as our spherical 
space, and the lines joining the origin to the test points are 
our former lines, separated by angles whose cosines equal 
the correlation coefficients. 
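The equivalence of cosine and correlation in this person-space can be illustrated on a small scale. The sketch assumes that “normalized” means measured from the mean and scaled so that the squares of the scores add up to unity; the scores of the five persons are hypothetical.

```python
import math

def normalize(scores):
    """Measure scores from their mean and scale them to unit length."""
    mean = sum(scores) / len(scores)
    deviations = [s - mean for s in scores]
    length = math.sqrt(sum(d * d for d in deviations))
    return [d / length for d in deviations]

def cosine_between_tests(test_a, test_b):
    """Cosine of the angle between two test points in person-space,
    each person supplying one orthogonal axis."""
    u, v = normalize(test_a), normalize(test_b)
    return sum(p * q for p, q in zip(u, v))

# Hypothetical scores of five persons in two tests.
test_1 = [72, 65, 58, 80, 61]
test_2 = [59, 60, 50, 70, 52]
print(round(cosine_between_tests(test_1, test_2), 3))
```

The number printed is the ordinary product-moment correlation of the two columns of scores.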

9. Factor axes.—The problem of factorial analysis is to
decide upon a set of axes to use in the space in which the
test lines exist. Let us explain this first of all in the simplest
case, that of two tests, represented by their lines in a plane,
at the angle corresponding to their correlation.

In this case, the most natural way of drawing orthogonal
axes on the paper is to place one of them (see Figure 17)
half-way between the test vectors, and the other, of course,
at right angles to the first. Of these two factor axes, OA is
as near as it can be to both test lines.

Figure 17.

We pictured, before, a swarm of ten thousand dots on 
the paper, each representing a person by his scores in the 
two tests, found by dropping perpendiculars from his dot 
to the two vectors. Instead of describing each point (each 
person, that is) by the two test scores, it is clear that we 
could describe it by the two factor scores—the feet of
perpendiculars on to the factor axes. It is also clear
that, as far as this purpose goes, we might have taken
our factor axes anywhere, and not necessarily in the posi-
tions OA and OB, provided they went through the point O
and were at right angles. In other words, we can “rotate”
OA and OB round the point O, and any position is equally
good for describing the crowd of persons. Either of the
tests, indeed, might be made one of the factors. The
positions shown in Figure 17 are advantageous only if we
want to use only one of our factors and discard the other,
in which case obviously OA is the one to keep, as it lies
as near as possible to both test axes. The scores along OA
are the best possible single description of the two test
results.

* See footnote, page 6.


10. Spearman axes for two tests.—The orthogonal axes
chosen by Spearman for his factors are, however, none of
the positions to which OA and OB can be rotated in the
plane of the paper. Besides, Spearman has three factors,
and therefore three axes, for two tests, namely the general
factor and the two specific factors, and we cannot have
three orthogonal axes in a plane. They must lie in three-
dimensional space, like the three lines which meet in the
corner of a room. If we rotate the OA and OB of Figure 17
out of the plane of the paper (say, pushing A below the
surface of the paper, and, say, raising B above it), we shall
clearly have to add a third axis, at right angles to OA and
OB, to enable us to describe the tests and the persons who
[...]

11. Spearman axes for four tests.—We are accustomed to
depicting three dimensions on a flat sheet of paper, and
so we can, in Figure 18, represent the Spearman axes g, s1,
and s2 for two tests. And since we have begun to depict
other dimensions, by means of perspective, on a flat sheet,
let us continue the process and by a kind of super-per-
spective imagine that the lines s3, s4, and any others we
may care to add, represent axes sticking out into a fourth,
a fifth, and higher dimensions. Figure 18 thus represents
the five Spearman axes for four tests, of which only the
line of the first test is shown (in its positive half only).

All the five lines g, s1, s2, s3, and s4 must be imagined as
being each at right angles to all the others in five-dimensional
space. The line of Test 1, shown in the diagram, lies in the
plane or wall edged by g and s1. It forms acute angles with g
and with s1, the cosines of which angles are its saturations
with g and s1 respectively. If it had been highly saturated
with g, it would have leaned nearer to g and farther away
from s1.

Figure 18.

The other three axes, s2, s3, and s4, are all at right angles
to the wall or plane in which Test 1 lies. They have,
therefore, no correlation with Test 1, no share in its
composition. Test line 2 similarly lies in the wall edged
by g and s2, test line 3 in that edged by g and s3. The
axis g forms a common edge to all these planes. If the 
battery of tests is hierarchical—that is, if the tetrad- 
differences are all zero—then all the tests of the battery 
can be depicted in this way, each in its own plane at right 
angles to all the other planes, no test line being in the 
spaces between the “walls.”

The four test lines themselves, of course, are only in 
a four-dimensional space (a 4-space we shall say, for 
brevity). Just as, when we were discussing Figure 17, we 
said that Spearman used three axes which were all out of 
the plane of the paper, so here in Figure 18, with four test 
lines (only one shown) in a 4-space, Spearman uses five 
axes in a space of one dimension higher than the number 
of tests. For n hierarchical tests, Spearman's factors are 
in an (n + 1)-space. 
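The arrangement of Figure 18 can be written out in co-ordinates. The sketch below is a minimal illustration with hypothetical saturations of our own choosing: each unit test vector has one component along g and one along its own specific, so that n tests occupy n + 1 axes, and the tetrad-differences of the resulting correlations vanish.

```python
import math

def spearman_vectors(saturations):
    """Unit test vectors in (n + 1)-space. Co-ordinate 0 is the g axis;
    co-ordinate i + 1 is the specific axis of test i."""
    n = len(saturations)
    vectors = []
    for i, g in enumerate(saturations):
        v = [0.0] * (n + 1)
        v[0] = g                          # cosine of the angle with g
        v[i + 1] = math.sqrt(1 - g * g)   # cosine of the angle with s_i
        vectors.append(v)
    return vectors

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

g_saturations = [0.9, 0.8, 0.7, 0.6]      # hypothetical values
tests = spearman_vectors(g_saturations)
r = [[dot(a, b) for b in tests] for a in tests]

# One tetrad-difference, with tests numbered from 1: r12*r34 - r13*r24.
tetrad = r[0][1] * r[2][3] - r[0][2] * r[1][3]
print(len(tests[0]), "axes;", "tetrad-difference zero:", abs(tetrad) < 1e-12)
```

Each correlation comes out as the product of the two saturations with g, which is why every tetrad-difference is zero.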

If along each test line we measure the same distance 
as a unit, then perpendiculars from these points* on to the
g axis will give the saturations of the tests with g as fractions
of this unit distance. The four dots on the g axis in Figure
18 may thus be taken as representing the test vectors†
projected on to the “ common-factor space,” which is here 
a line, a space of one dimension only. Thurstone’s system 
is like Spearman’s except that the common-factor space is 
of more dimensions, as many as there are common factors. 
Figure 19 shows the Thurstone axes for four tests whose 
matrix of correlation coefficients can be reduced to rank 2. 

12. A common-factor space of two dimensions.—Here there
are two common factors, a and b, and four specifics, s1,
s2, s3, and s4. All the six axes representing these factors
in the figure are to be imagined as existing in a 6-space,
each at right angles to all the others. The common-factor
space is here two-dimensional, the plane or wall edged by a
and b—to make it stand out in the figure, a door and a
window have been sketched upon it.

Figure 19.

In Spearman’s Figure 18, each test line lay in a plane
defined by g and one of the specific axes. Here in Figure
19, each test line lies in a different 3-space. These different
3-spaces have nothing in common with one another except
the plane ab, the wall with the door and window in the
diagram. In Figure 18 the projections of the unit test
vectors on to the common-factor space were lines which all
coincided in direction (though they were of different
lengths), for there the common-factor space was a line.
Here the common-factor space is a plane, and the pro-
jections of the four test vectors on to that plane are shown
in the figure by the numbered lines on the “wall.” These
lines, if they are all projections of vectors of unit length,
will by their lengths on the wall represent the square roots
of the communalities.

* These points are then the same as those arrived at by the process
described in Section 8 (page 99).

† A vector is a direction with a magnitude, and now that we have
measured unit distance along each test line, we may speak of unit
test vectors.

13. The common-factor space in general.—When there are 
r common factors, the common-factor space is of r dimen- 
sions, and the whole factor space (including the specifics) is 
of (n + r) dimensions. The test vectors themselves are in an 
n-space; their projections on to the common-factor space
are crowded into an r-space, and are naturally at smaller
angles with one another than the actual test vectors are. 
These angles between the projected test vectors do not, 
therefore, represent by their cosines the correlations be- 
tween the tests. The angles are too small for that, and 
the cosines, therefore, too large. But if we multiply the 
cosine of such an angle by the lengths of the two projections 
which it lies between, we again arrive at the correlation. 

Thus in Figure 19, the angle between the lines 1 and 3 
on the wall is less than the angle between the actual test 
vectors 1 and 3 out in the 6-space, of which the lines on 
the wall are the projections. But the lengths of the lines 1 
and 3 on the wall are less than the unit length we marked 
off on the actual vectors, being, in fact, the roots of the com-
munalities. If we call these lengths on the wall h1 and h3,
then the product h1h3 times the cosine of the projected
angle again gives the correlation coefficient.
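A small numerical case may make the last point plainer. In the sketch below the loadings of Tests 1 and 3 on the two common factors are hypothetical figures of our own; the lengths on the wall are the roots of the communalities, and the product of the two lengths and the projected cosine gives back the correlation.

```python
import math

# Hypothetical loadings of Tests 1 and 3 on the common factors a and b.
loadings = {1: (0.7, 0.3), 3: (0.4, 0.6)}

def length(v):
    """Length of a projection on the wall (root of the communality)."""
    return math.sqrt(sum(x * x for x in v))

def projected_cosine(u, v):
    """Cosine of the angle between two projections on the wall."""
    return sum(p * q for p, q in zip(u, v)) / (length(u) * length(v))

h1, h3 = length(loadings[1]), length(loadings[3])
cos_wall = projected_cosine(loadings[1], loadings[3])
r13 = h1 * h3 * cos_wall

print("communalities:", round(h1 ** 2, 2), round(h3 ** 2, 2))
print("cosine on the wall:", round(cos_wall, 3))
print("correlation:", round(r13, 2))
```

The cosine on the wall is larger than the correlation, as the text says it must be, and is brought down to it by the two lengths.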

14. Rotations.—It will be remembered that Thurstone, 
after obtaining a set of loadings for the common factors 
by his method of analysis of the matrix of correlations, 
“rotates” the axes until the loadings are all positive—
and he also likes to make as many of them as possible zero.
It is instructive to look at this procedure in the light of our
geometrical picture from which the phrase “rotating the
factors” is taken. It should be emphasized first of all
that such rotation of the common-factor axes in Thur- 
stone’s system must take place entirely within the com- 
mon-factor space, and the common-factor axes must not 
leave that space and encroach upon the specifics. In 
Figure 18, therefore, no rotation, in Thurstone’s sense, of 
the g axis can be made (since the common-factor space is a 
line), except, indeed, reversing its direction and measuring
stupidity instead of intelligence.

In Figure 19 the common-factor space is a plane, and 
the axes a and b can be rotated in this plane, like the hands 
of a clock fixed permanently at right angles to one another. 
When the positive directions of a and b enclose all the 
vector projections, as they do in our figure, then all the 
loadings are positive. The position shown would, there- 
fore, fulfil this desire of Thurstone’s. Moreover, one of 
the loadings could be made zero, by rotating a and b until 
a coincides with line 1 (when b will have no loading in
Test 1), or until b coincides with line 4 (when a will have 
no loading in Test 4). 
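The rotation in the plane can be tried numerically. The sketch assumes four hypothetical pairs of loadings, all positive, and turns the axes a and b together until a lies along the projection of Test 1; the names `rotate` and `angle` are our own.

```python
import math

def rotate(loadings, angle):
    """Loadings on a and b after both axes, kept at right angles, are
    turned anticlockwise through `angle` (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * a + s * b, -s * a + c * b) for a, b in loadings]

# Hypothetical loadings of four tests on the common factors a and b.
loadings = [(0.7, 0.3), (0.6, 0.5), (0.4, 0.6), (0.2, 0.7)]

# Turn a until it coincides with the projection of Test 1.
angle = math.atan2(loadings[0][1], loadings[0][0])
rotated = rotate(loadings, angle)

for (a, b) in rotated:
    print(round(a, 3), round(b, 3))
```

The communality of each test, and the correlation of each pair, are unchanged by the rotation; only the way they are shared between a and b is altered.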

When there are three common factors, the common- 
factor space is an ordinary 3-space. The three common- 
factor axes divide this space into eight octants. Rotating 
them until all the loadings are positive means until all the 
projections of the test vectors are within the positive 
octant. This will always be nearly possible if the corre-
lations are all positive. Moreover, it is clear that we can
always make at any rate some loadings zero. In the
common-factor 3-space we can move one of the axes until
it is at right angles to two of the test projections, in which
tests that factor will then have no loading. Keeping that
axis fixed, we can then rotate the other two axes round it,
seeking for a position where one of them is at right angles
to some test. The number of zero loadings obtainable
will clearly be limited unless the configuration of the test 
vectors happens to lend itself to many zeros. We shall see 
later that Thurstone seeks for teams of tests which do this. 
Although Thurstone makes his rotations exclusively 
within the common-factor space, keeping the specifics
sacrosanct at their maximum variance, there is, of course,
nothing to prevent anyone who does not hold his views
from rotating the common-factor axes into a wider space,
and increasing the number of common-factor axes at the
expense of the specific variance, until ultimately we reach as
many common factors as we have tests, and no specifics. 

15. The geometrical picture of centroid analysis.—Think
of a sheaf of lines representing a number of tests, with
angles corresponding to the correlations. Centroid analysis
means (if unities are used in the diagonal cells) finding a
line in the middle of this sheaf—at the centroid or resultant
—something like the stick in the middle of the ribs of a
slightly opened umbrella, except that our test lines are not
regularly spaced like those ribs.

All this is in a space of as many dimensions as there are
tests, and it is not possible to make a drawing. But if the
reader will be tolerant, we can make one of our “super-
perspective” drawings showing a sheaf of test lines (see
Figure 20) which must be imagined as being in a multi-
dimensional space. The centroid line OC is the line along
which the point O would move if each test line were a force
—all equal—pulling O. It is exactly like the parallelogram
of forces on a multi-dimensional scale. The dots on the
test lines are at unit distance from O. (They have been
joined by lines only in order to make the figure look more
solid.) The loadings of the tests in the first centroid
factor are the projections of these unit distances on to OC—
this is when unities are used in the diagonal cells. The
summation process gives, arithmetically, these projected
distances along OC.

Figure 20.
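The summation process for the first centroid can be sketched briefly. As an illustration we take the four-test matrix of correlations analysed in the next chapter, with unities in the diagonal cells; the function name is our own. Each loading is a column sum divided by the square root of the grand total, which is the projection of that unit test vector on to OC.

```python
import math

# Four-test correlation matrix, unities in the diagonal cells.
R = [[1.0, 0.4, 0.4, 0.2],
     [0.4, 1.0, 0.7, 0.3],
     [0.4, 0.7, 1.0, 0.3],
     [0.2, 0.3, 0.3, 1.0]]

def centroid_loadings(matrix):
    """First centroid loadings: column sums over the root of the total."""
    column_sums = [sum(column) for column in zip(*matrix)]
    grand_total = sum(column_sums)
    return [s / math.sqrt(grand_total) for s in column_sums]

print([round(x, 3) for x in centroid_loadings(R)])
```

The grand total of the matrix is the squared length of the resultant of the unit test vectors, which is why its square root serves as the divisor.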

The next part of the arithmetical process consisted in 
removing that part of the correlation coefficients explained 
by the first factor loadings. This means, in our space 
diagram, that the dimension parallel to OC is abolished, 
and all the test lines are projected on to a space at right 
angles to OC and of one dimension less than the original,
(n - 1) dimensions instead of n, if n be the number of
tests. 

We have had perforce to draw our diagram as though 
it were in a three-fold instead of an n-fold space : and for 
this new (n - 1)-fold space we have drawn an ordinary
plane, like a drawing-board, at right angles to OC, and
projected the five test lines on to it. The next thing is to
find the centroid of these five directed lines, these vectors, 
on the drawing-board. But we find at once that they are 
in equilibrium. If they were forces, the point O would 
not move. That is because OC is indeed the centroid of 
the original lines. This fact of equilibrium corresponds to 
the fact that the columns of residues add up to zero. 
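The equilibrium can be confirmed arithmetically. The sketch below again assumes, for illustration only, the four-test matrix of the next chapter with unities in the diagonal; it removes the part of each cell explained by the first centroid loadings and adds up the columns of residues.

```python
import math

# Four-test correlation matrix, unities in the diagonal cells.
R = [[1.0, 0.4, 0.4, 0.2],
     [0.4, 1.0, 0.7, 0.3],
     [0.4, 0.7, 1.0, 0.3],
     [0.2, 0.3, 0.3, 1.0]]

def first_centroid_residuals(matrix):
    """Residues left when the first centroid factor is removed."""
    sums = [sum(column) for column in zip(*matrix)]
    root_total = math.sqrt(sum(sums))
    loadings = [s / root_total for s in sums]
    n = len(matrix)
    return [[matrix[i][j] - loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]

residuals = first_centroid_residuals(R)
column_sums = [sum(column) for column in zip(*residuals)]
print([round(abs(s), 10) for s in column_sums])
```

Every column of residues adds up to zero (apart from rounding in the machine), although individual cells are some positive and some negative.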

To get over this, in the arithmetic, we changed the signs
of some rows and corresponding columns, till, if possible,
all cells were positive. (These cells of the residues are the
cosines of the angles on the drawing-board, some of which
are clearly obtuse, with negative cosines.) This reversal of
signs in the arithmetic corresponds, in our diagram, to
reversing some of the vectors on the drawing-board, till
they again form a sheaf, as close as possible. Two are
shown as reversed in our figure, and most of the angles are
now acute, most of the cosines positive. It is desirable to
make the sheaf as compact as possible, corresponding to
making as many cells positive as possible.

The centroid of the resulting sheaf of vectors (or forces)
is the second factor. Its dimension is next abolished,
by projection on to a space of (n - 2) dimensions, and so
on, and so on. Our possibility of following this in a draw-
ing is beyond delineation, but if the reader will in imagina-
tion conceive of our first sheaf of test lines being in n
dimensions, and being step by step projected on to spaces
of (n - 1), (n - 2) and lesser dimensions, he will have a
picture corresponding to the arithmetical summation pro-
cess and the sign reversals in the residues.

For simplicity we have above supposed that unities
were being left in the diagonal cells, in which case as many
common factors would emerge as there were tests, and
there would be no specifics. If communalities are inserted
and the rank of the matrix of correlations reduced, there
will be fewer common factors. Our diagram would then
be in the common-factor space and, indeed, can still serve, 
if we suppose the distances from O to the dots on the test 
lines to be not unity, but the square roots of the commun- 
alities, and the angles to be the projections of those between 
the full test lines. With that change, our diagram would 
be one for the communal parts of five tests with three 
common factors, represented by OC, by the resultant of 
the vectors on the drawing-board (after the reversals to 
destroy the equilibrium), and by a third line also on the 
drawing-board, at right angles again. 

16. Principal components.—The object of using centroids 
as axes in the above process is to obtain axes in diminishing 
order of importance as describers of the test lines. In the 
current jargon, they each “take out” as much variance as
possible at each step—or rather, not quite as much as 
possible, though nearly so. There is another set of lines 
which actually do take out as much as possible. They are 
the lines corresponding to the axes of the ellipsoid of 
Figure 15, or the more general ellipsoids of higher dimen-
sions. The centroid OC in our Figure 20 is in such a
position that the sum of the squares of the vertical distances 
of the test dots to it is very small, nearly as small as 
possible. Another line, however, quite close to OC and 
corresponding to the major axis of the ellipsoid, makes this 
sum of squares an absolute minimum, and the sum of 
squares of the loadings of the factor a maximum.

In Section 5 above we spoke of converting the ellipsoid 
of our Figure 15 into a sphere by swivelling the three test 
lines nearer to each other till the cosines of their angles 
correspond to the correlation coefficients, and the test lines 
take up positions such as they have in our Figure 20. 
When this is done, the major axis of the ellipsoid takes up
a position among the test lines, quite near to the centroid
but not quite coinciding, and with the property of maxi-
mizing the “variance taken out.” Similarly, the other
principal axes of the ellipsoid, when the change is made in
the space, replace for the better the later centroids of the
simpler process. The arithmetical method of calculating
their loadings is explained in our next chapter.
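How nearly the centroid succeeds can be seen from a short calculation. The sketch assumes the four-test matrix of the next chapter, with unities in the diagonal, and borrows from that chapter the final multipliers of the first principal component; the function name `variance_taken_out` is our own. For an axis formed by weighting the test vectors, the sum of the squares of the loadings is computed directly from the matrix.

```python
# Four-test correlation matrix, unities in the diagonal cells.
R = [[1.0, 0.4, 0.4, 0.2],
     [0.4, 1.0, 0.7, 0.3],
     [0.4, 0.7, 1.0, 0.3],
     [0.2, 0.3, 0.3, 1.0]]

def variance_taken_out(matrix, weights):
    """Sum of squared loadings of the tests on the axis that points
    along the weighted sum of the unit test vectors."""
    rw = [sum(r * w for r, w in zip(row, weights)) for row in matrix]
    squared_length = sum(w * x for w, x in zip(weights, rw))
    return sum(x * x for x in rw) / squared_length

centroid = variance_taken_out(R, [1, 1, 1, 1])
principal = variance_taken_out(R, [0.772865, 1, 1, 0.629811])
print(round(centroid, 4), round(principal, 4))
```

Equal weights give the centroid; the weights of the principal axis take out slightly more, and no other weights can take out more still.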


CHAPTER VII 
PRINCIPAL COMPONENTS 


1. A historical accident.—By a historical accident, the
method of principal components is associated in the minds 
of psychologists with analyses in which unities, and not 
communalities, are used in the diagonal cells of the square 
table of correlations. The centroid method can, however, 
equally well be used on such a table, giving the centroids of 
the complete test vectors in the whole test space: and the 
principal components of the communality vectors, in the 
common-factor space, can be found, using communalities in 
the diagonal cells, by the same iterative process as we are
about to describe. As, however, this method was originally 
used on unit entries, we shall first make a principal com- 
ponents analysis of the whole tests of the example already 
used for the centroid process. Later we shall analyse the 
communality vectors by the same process (page 118). 

2. A calculation.—The actual calculation of the loadings
of principal components requires, for its complete under-
standing, a grasp of the method of finding algebraically the
principal axes of an ellipsoid, a problem which will be
found dealt with in three dimensions in any text-book on
solid geometry. We give an account of this, for n dimen-
sions, in the Appendix. Here we shall only explain
Hotelling’s (1933) ingenious iterative method of doing this
arithmetically, by means of an example, for which we shall
use the matrix of correlations already employed in Chapter
V to illustrate the centroid method (see page 108).

    1.0    .4    .4    .2   |   .8    .78    .775
     .4   1.0    .7    .3   |  1.0   1.00   1.000
     .4    .7   1.0    .3   |  1.0   1.00   1.000
     .2    .3    .3   1.0   |   .7    .65    .637

    .80   .32   .32   .16
    .40  1.00   .70   .30
    .40   .70  1.00   .30
    .14   .21   .21   .70

Hotelling’s arithmetical process then begins with a guess 
at the proportionate loadings of the first principal com- 
ponent. Practically any guess will do—a bad guess will 
only make the arithmetic longer. We have guessed .8, 1,
1, .7, the numbers to be seen on the right of the matrix,
because these numbers are roughly proportional to the
sums of the four columns, and such numbers usually give
a good first guess.

Each row of the matrix is then multiplied by the guessed 
number on its right, giving the matrix below the first one, 
beginning with .80. We then take, as our second guess,
numbers proportional to the sums of the columns of this
matrix,* namely—

    1.74   2.23   2.23   1.46
giving
     .78   1      1       .65
That is, we divide the sums of the columns by their largest 
member, and use the results as new multipliers. They 
are seen placed farther on the right of the original matrix. 
It is unusual for two of them to be of the same size—that 
is a peculiarity of our example. 

It is always the original matrix whose rows are multiplied 
by each improved set of multipliers. The above set gives 
the next matrix shown, that beginning with -780, and the 
sums of its columns— 


1-710 2-207 2-207 1-406 
give a third guess at the multipliers, namely— 
775 1 i :637 


* When a calculating machine is being used, this matrix will not 
be actually written down—the column sums will be arrived at on the 
machine. 
And so the reiteration goes on, and the reader, who is
advised to carry it a stage farther at least, would find if he
persevered that the multipliers would change less and less.
If he went on long enough, he would reach this point
(usually, however, far fewer decimals are sufficient):

 1.0     .4     .4     .2    |    .772865
  .4    1.0     .7     .3    |   1.000000
  .4     .7    1.0     .3    |   1.000000
  .2     .3     .3    1.0    |    .629811

  .772865    .309146    .309146    .154573
  .400000   1.000000    .700000    .300000
  .400000    .700000   1.000000    .300000
  .125962    .188943    .188943    .629811

 1.698827   2.198089   2.198089   1.384384
giving
  .772865   1          1           .629813

that is, totals in exactly the same proportion as the
multipliers. These final multipliers (or earlier ones if the
experimenter is content with less exact values) are then
proportionate to the loadings of the first principal component in
the four tests. They have, however, to be reduced until
the sum of their squares equals the largest total, 2.198089,
which is called the first “latent root” of the original
matrix. This is done by dividing them by the square root
of the sum of their squares and multiplying them by the
square root of the latent root. They then become:

.662   .857   .857   .540
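The arithmetic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Hotelling's own layout: the names `column_sums` and `hotelling_first_component`, the stopping tolerance `tol`, and the limit `max_iter` are assumptions introduced here. The matrix and the first guess are those of the text.

```python
import math

# Correlation matrix of the four tests in the worked example.
R = [
    [1.0, 0.4, 0.4, 0.2],
    [0.4, 1.0, 0.7, 0.3],
    [0.4, 0.7, 1.0, 0.3],
    [0.2, 0.3, 0.3, 1.0],
]


def column_sums(matrix, multipliers):
    """Multiply each row by its multiplier, then total the columns."""
    n = len(matrix)
    return [sum(multipliers[i] * matrix[i][j] for i in range(n))
            for j in range(n)]


def hotelling_first_component(matrix, guess, tol=1e-12, max_iter=1000):
    """Return (multipliers, latent_root, loadings) of the largest component.

    `tol` and `max_iter` are illustrative choices, not taken from the text.
    """
    m = list(guess)
    for _ in range(max_iter):
        sums = column_sums(matrix, m)
        largest = max(sums)
        new_m = [s / largest for s in sums]  # divide by the largest total
        change = max(abs(a - b) for a, b in zip(new_m, m))
        m = new_m
        if change < tol:
            break
    root = max(column_sums(matrix, m))  # largest total: the first latent root
    # Reduce the multipliers until their sum of squares equals the root.
    scale = math.sqrt(root / sum(x * x for x in m))
    return m, root, [x * scale for x in m]


multipliers, root, loadings = hotelling_first_component(R, [0.8, 1.0, 1.0, 0.7])
print([round(x, 6) for x in multipliers])
print(round(root, 6))
print([round(x, 3) for x in loadings])
```

Run on the example, the multipliers, the latent root, and the loadings should agree with the figures printed above to the number of decimals shown there.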


The next step in Hotelling's process is similar to one
with which we have already become familiar in Thurstone's
method. The parts of the variances and correlations
due to this first component are calculated and subtracted
from the original experimental matrix. These variances
and correlations due to the first component are shown in
the first table below. The residual matrix is then treated
in exactly the same way as the original matrix, the
beginnings of the process being shown in the second table.
There is no need, in this process, for sign-changing.

          .662   .857   .857   .540

 .662  |  .439   .567   .567   .357     Matrix due to
 .857  |  .567   .734   .734   .462     first principal
 .857  |  .567   .734   .734   .462     component.
 .540  |  .357   .462   .462   .291

                                          Multipliers
Residual   .561  −.167  −.167  −.157  |   .3     .18
matrix    −.167   .266  −.034  −.162  |  −.4    −.38
          −.167  −.034   .266  −.162  |  −.4    −.38
          −.157  −.162  −.162   .709  |  1.0    1.00

           .168  −.050  −.050  −.047
           .067  −.106   .013   .065
           .067   .013  −.106   .065
          −.157  −.162  −.162   .709

           .145  −.305  −.305   .792

The guessed multipliers, proportional to the sums of the
columns, are not so near the truth this
time, for the first one, which we have guessed at .3, and
which reduces after one operation to .18, goes on reducing
until it becomes negative, the final values of these second
loadings being as shown in the appropriate column of the
following table, which also gives the loadings of the third
and fourth factors, obtained in the same way. The variances
and correlations due to each factor in turn are
subtracted from the preceding residual matrix and the new
residual matrix analysed for the next factor:


                            Factor                         Sum of
               I          II          III         IV       squares
Test 1      .662218   −.323324     .675967      .            1
Test 2      .856836   −.135197    −.312332   −.387298        1
Test 3      .856836   −.135197    −.312332    .387298        1
Test 4      .539645    .826092     .162323      .            1

Sum of
squares *  2.198090    .823526     .678383    .300000        4

Percentages  55.0       20.6        16.9        7.5         100


* These four quantities are, in the Hotelling process, what are
called the “latent roots” of the matrix. Their product gives the
value, .3684, of the determinant of the matrix of correlation
coefficients.
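The successive extraction of components, each taken from the residues left by the one before, can be sketched as follows. The function names are hypothetical. One departure from the text is labelled in the code: instead of guessing multipliers from the column sums, the sketch starts from the column holding the largest diagonal cell, because the column sums of the last residual matrix in this example are all zero.

```python
import math

R = [
    [1.0, 0.4, 0.4, 0.2],
    [0.4, 1.0, 0.7, 0.3],
    [0.4, 0.7, 1.0, 0.3],
    [0.2, 0.3, 0.3, 1.0],
]


def largest_component(matrix, tol=1e-12, max_iter=10000):
    """Hotelling iteration on `matrix`; returns (latent_root, loadings)."""
    n = len(matrix)
    # Assumed starting guess (not the text's): the column of the matrix
    # that holds the largest diagonal cell.
    start = max(range(n), key=lambda i: matrix[i][i])
    m = [matrix[i][start] for i in range(n)]
    root = 0.0
    for _ in range(max_iter):
        sums = [sum(m[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        root = max(sums, key=abs)  # largest total, sign included
        new_m = [s / root for s in sums]
        change = max(abs(a - b) for a, b in zip(new_m, m))
        m = new_m
        if change < tol:
            break
    scale = math.sqrt(root / sum(x * x for x in m))
    return root, [x * scale for x in m]


def principal_components(matrix):
    """Extract every component in turn, subtracting each from the residues."""
    n = len(matrix)
    residual = [row[:] for row in matrix]
    roots, columns = [], []
    for _ in range(n):
        root, column = largest_component(residual)
        roots.append(root)
        columns.append(column)
        # Remove the variances and correlations due to this component.
        for i in range(n):
            for j in range(n):
                residual[i][j] -= column[i] * column[j]
    return roots, columns


roots, columns = principal_components(R)
determinant = math.prod(roots)  # product of the latent roots
print([round(r, 6) for r in roots], round(determinant, 4))
```

The signs of a whole column of loadings may come out reversed relative to the table, which changes nothing of substance. The product of the four roots gives the determinant mentioned in the footnote.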
An alternative method of finding principal components,
due to Kelley, is to deal with the variables two at a time. 
The pair first chosen are rotated in their plane until they 
are uncorrelated. Then the same is done to another pair, 
and so on, the new uncorrelated variables being in turn 
paired with others, until finally all correlations are zero. 
(Kelley, 1935, Chapters I and VI.) A chief advantage is
that the components are obtained pari passu, and not 
successively ; also, in certain circumstances where Hotel- 
ling's process converges very slowly, Kelley’s is quicker. 
The end-results are the same. 
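The idea of rotating the variables two at a time until all correlations vanish can be illustrated by the classical Jacobi rotation scheme. This is offered as a stand-in sketch under that assumption and is not claimed to reproduce Kelley's own arithmetic. The name `jacobi_latent_roots` and its tolerance are hypothetical.

```python
import math

R = [
    [1.0, 0.4, 0.4, 0.2],
    [0.4, 1.0, 0.7, 0.3],
    [0.4, 0.7, 1.0, 0.3],
    [0.2, 0.3, 0.3, 1.0],
]


def jacobi_latent_roots(matrix, tol=1e-12, max_sweeps=100):
    """Rotate pairs of variables until every off-diagonal cell is negligible."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = max(abs(a[p][q]) for p in range(n) for q in range(p + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < tol:
                    continue
                # Angle of rotation in the plane (p, q) that makes the
                # two rotated variables uncorrelated.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    # The diagonal now holds the variances of the uncorrelated variables.
    return sorted((a[i][i] for i in range(n)), reverse=True)


pairwise_roots = jacobi_latent_roots(R)
print([round(r, 6) for r in pairwise_roots])
```

All four roots emerge together rather than one after another, and they should agree with the sums of squares at the foot of the table of loadings.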

3. Acceleration by powering the matrix.—In a later paper
Hotelling pointed out that his process of finding the
loadings of the principal components can be much expedited
by analysing, not the matrix of correlations itself, but its
square, or fourth, eighth, or sixteenth power, got by
repeated squaring (Hotelling, 1935b). Squaring a
symmetrical matrix is a special case of matrix multiplication
(see Chapter X, Section 4, page 145): it is done by finding
the “inner products” (see footnote, page 74) of each pair of
rows, including each row with itself, and setting the
results down in order. Applying this to the correlation
matrix:

 1.0     .4     .4     .2
  .4    1.0     .7     .3
  .4     .7    1.0     .3
  .2     .3     .3    1.0

we see that the inner product of the first row with itself
is 1.36; of the first row with the second, 1.14; and so
on. Setting these down in order, we get for the matrix
squared:

 1.36   1.14   1.14    .64
 1.14   1.74   1.65    .89
 1.14   1.65   1.74    .89
  .64    .89    .89   1.22
Exactly the same process is applied to this, beginning
with guessed multipliers, as we applied to the original
matrix. The multipliers, however, settle down twice as
rapidly towards their final values, which are the same here
as there. We have finally:

 1.36   1.14   1.14    .64    |    .772865
 1.14   1.74   1.65    .89    |   1.000000
 1.14   1.65   1.74    .89    |   1.000000
  .64    .89    .89   1.22    |    .629811

 1.051096    .881066    .881066    .494634
 1.140000   1.740000   1.650000    .890000
 1.140000   1.650000   1.740000    .890000
  .403079    .560532    .560532    .768369

 3.734175   4.831598   4.831598   3.043003

Ratio
  .772865   1.000000   1.000000    .629812

The “latent root,” however, or largest total, 4.831598, is
the square of the former latent root, 2.198090, so that its
square root must be taken before we complete finding the
loadings.
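The squaring by inner products, and the quicker settling of the multipliers, can be checked with a short sketch. The helper names `square` and `iterate` and the tolerance are assumptions made for illustration. The count of rounds depends on that tolerance and is shown only to compare the two matrices.

```python
R = [
    [1.0, 0.4, 0.4, 0.2],
    [0.4, 1.0, 0.7, 0.3],
    [0.4, 0.7, 1.0, 0.3],
    [0.2, 0.3, 0.3, 1.0],
]


def square(matrix):
    """Square a symmetric matrix by inner products of each pair of rows."""
    n = len(matrix)
    return [[sum(a * b for a, b in zip(matrix[i], matrix[j]))
             for j in range(n)] for i in range(n)]


def iterate(matrix, guess, tol=1e-9, max_iter=1000):
    """Return (multipliers, largest_total, rounds_needed)."""
    n = len(matrix)
    m = list(guess)
    for rounds in range(1, max_iter + 1):
        sums = [sum(m[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        largest = max(sums)
        new_m = [s / largest for s in sums]
        change = max(abs(a - b) for a, b in zip(new_m, m))
        m = new_m
        if change < tol:
            break
    return m, largest, rounds


guess = [0.8, 1.0, 1.0, 0.7]
R2 = square(R)
m1, root1, rounds1 = iterate(R, guess)
m2, root2, rounds2 = iterate(R2, guess)
print(rounds1, rounds2)            # the squared matrix needs fewer rounds
print(round(root2, 6), round(root1 ** 2, 6))
```

The end multipliers from the two matrices coincide, while the largest total from the squared matrix is the square of the latent root of the original, as stated in the text.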

In exactly the same way the squared matrix may be
again squared, and again and again, before we analyse it.
The more we square it, the quicker the Hotelling iteration
process works. The end multipliers are always the same,
but the “root” is the same power of the root we need as
is the matrix of the original matrix.

A still further acceleration of the process is due to Cyril
Burt, who observed that as the matrix is repeatedly
squared it becomes more and more nearly hierarchical,
including the diagonal cells (Burt, 1937a). This is due
to the largest factor increasingly predominating as it is
“powered,” especially if the largest latent root is widely
separated from the others. In consequence, the square
roots of the diagonal cells become more and more nearly
in the ratio of the Hotelling multipliers, and form an
excellent first guess for the latter. When our matrix
is squared twice again, giving the eighth power, it
becomes:
 108.78   140.67   140.67    88.54
 140.67   182.03   182.03   114.61
 140.67   182.03   182.03   114.61
  88.54   114.61   114.61    72.38

and the square roots of its diagonal members are:

 10.429   13.492   13.492   8.508

which are in the ratio:

  .7730   1   1   .6306

very near indeed to the Hotelling final multipliers:

  .772865   1   1   .629811


Hotelling gives a method of finding the residues, for the
purpose of calculating the next factor loadings, from the
“powered” matrix. But it may be so nearly perfectly
hierarchical that this fails unless an enormous number of
decimals have been retained, and it is in practice best to
go back to the original matrix and obtain the residues
from it. Their matrix can in turn ...
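Burt's observation about the diagonal cells of the powered matrix is easily tried on the example. The sketch below squares the correlation matrix three times and takes the square roots of the diagonal cells as a first guess. The helper `square` and the variable names are hypothetical.

```python
import math

R = [
    [1.0, 0.4, 0.4, 0.2],
    [0.4, 1.0, 0.7, 0.3],
    [0.4, 0.7, 1.0, 0.3],
    [0.2, 0.3, 0.3, 1.0],
]


def square(matrix):
    """Square a symmetric matrix by inner products of each pair of rows."""
    n = len(matrix)
    return [[sum(a * b for a, b in zip(matrix[i], matrix[j]))
             for j in range(n)] for i in range(n)]


# Three squarings give the eighth power of the correlation matrix.
eighth = square(square(square(R)))

# Square roots of the diagonal cells, reduced so that the largest is unity.
diagonal_roots = [math.sqrt(eighth[i][i]) for i in range(len(eighth))]
first_guess = [d / max(diagonal_roots) for d in diagonal_roots]

print([[round(x, 2) for x in row] for row in eighth])
print([round(g, 4) for g in first_guess])
```

In the eighth power the cells for the second and third tests agree to two decimals, which illustrates why residues taken from so nearly hierarchical a matrix lose their significant figures.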


4.—If all the principal components are calculated
accurately, and if unities were used in the diagonal cells,
their loadings ought completely to exhaust the variance of
each test; that is, the sum of the squares of the loadings
in each row should be unity. The sum of the squares of the
loadings in each column equals the “latent root”
corresponding to that column, and the sum of the four latent
roots equals the number of tests. Each latent root is the
part of the whole variance of all the tests which has been
“taken out” by that factor. Thus the first factor ...

... “inner product” of each pair of rows. Applying this to
the table we find the correlation $r_{24}$, say, to be:

\[ r_{24} = .856836 \times .539645 - .135197 \times .826092 - .312332 \times .162323 - .387298 \times 0 = .300000 \]


In this way the loadings of the four principal com- 
ponents will exactly reproduce the correlations we began 
with. If, however, we have stopped the analysis after we 
have found only two principal components (or factors), 
these two would have reproduced the correlations only 
approximately. For example, for $r_{24}$ we should only
have:

\[ .856836 \times .539645 - .135197 \times .826092 = .350702 \quad \text{instead of} \quad .300000 \]
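The reproduction of a correlation from the table of loadings, with all four factors and with only the first two, can be sketched thus. The loadings are those printed in the table. The function name `reproduced` is hypothetical, and tests are numbered from zero in the code, so that test 2 is row 1 and test 4 is row 3.

```python
# Loadings from the table: one row per test, one column per factor.
loadings = [
    [0.662218, -0.323324, 0.675967, 0.0],
    [0.856836, -0.135197, -0.312332, -0.387298],
    [0.856836, -0.135197, -0.312332, 0.387298],
    [0.539645, 0.826092, 0.162323, 0.0],
]


def reproduced(i, j, factors):
    """Inner product of rows i and j over the first `factors` columns."""
    return sum(loadings[i][k] * loadings[j][k] for k in range(factors))


r24_full = reproduced(1, 3, 4)   # all four components
r24_two = reproduced(1, 3, 2)    # only the first two components

# A row with itself gives the variance of the test, which should be unity.
variances = [reproduced(i, i, 4) for i in range(4)]
# A column with itself gives the latent root of that factor.
column_squares = [sum(row[k] ** 2 for row in loadings) for k in range(4)]

print(round(r24_full, 6), round(r24_two, 6))
```

Because the printed loadings are rounded to six decimals, the reproduced values agree with the original correlations only to about five decimals.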


Before we leave the table of loadings, we may note that 
the signs of any column of the loadings can be reversed 
without changing either the variances or the correlations. 
Reversing the signs in a column merely means that we 
measure that factor from the opposite end, as we might 
rank people either for intelligence or stupidity and get the 
same order, but reversed. We will usually desire to call 
that direction of a factor positive which most conforms 
with the positive direction of the tests themselves, and 
therefore we will usually make the largest loading in each 
column positive. 

All the loadings of the first principal factor are, in an 
ordinary set of tests, positive. Of the other loadings, 
about half are negative. 

5. Calculation of a man’s principal components. — 
Factors obtained by using unities, and not communalities, 
in the diagonal cells have an important advantage. They 
can be calculated exactly from a man’s scores, whereas 
communality factors can only be estimated. This is 
because the former are never more numerous than the tests, 
whereas the latter, including the specifics, are always more 
numerous than the tests. For the former, therefore, we 
always have just the same number of equations as unknowns,
whereas we have more unknowns than equations
when communalities are used.

We have hitherto given the analysis of tests into factors
in the form of tables of loadings. But we can alternatively
write them out as “specification equations,” as we shall
call them. Thus the table on page 111 would be written:

\[
\begin{aligned}
z_1 &= .662218\,y_1 - .323324\,y_2 + .675967\,y_3 \\
z_2 &= .856836\,y_1 - .135197\,y_2 - .312332\,y_3 - .387298\,y_4 \\
z_3 &= .856836\,y_1 - .135197\,y_2 - .312332\,y_3 + .387298\,y_4 \\
z_4 &= .539645\,y_1 + .826092\,y_2 + .162323\,y_3
\end{aligned}
\]

Here $z_1$, $z_2$, $z_3$, and $z_4$ stand for the scores in the four
tests, measured in standard units; that is, measured from
the mean in units of standard deviation. The factors
$y_1$, $y_2$, $y_3$, and $y_4$ are also supposed to be measured in such
units. These specification equations enable us to calculate
any man’s standard score in each test if we know his
factors, and since there are just as many equations as
factors, they can be solved for the y’s and enable us to
calculate, conversely, any man’s factors if we know his
scores in the tests. The solution to these Hotelling
equations for the y’s happens to be peculiarly simple, as we
shall prove in the Appendix, Section 7. It is as follows:

\[
\begin{aligned}
y_1 &= (.662218\,z_1 + .856836\,z_2 + .856836\,z_3 + .539645\,z_4) \div 2.198090 \\
y_2 &= (-.323324\,z_1 - .135197\,z_2 - .135197\,z_3 + .826092\,z_4) \div .823526 \\
y_3 &= (.675967\,z_1 - .312332\,z_2 - .312332\,z_3 + .162323\,z_4) \div .678383 \\
y_4 &= (-.387298\,z_2 + .387298\,z_3) \div .300000
\end{aligned}
\]

The table on page 111, therefore, serves a double purpose.
Read horizontally it gives the composition of each test in
terms of factors. Read vertically it gives the composition
of each factor in terms of tests, if we divide the result by
the root at the foot of the column.*

* If the analysis has been made with reliabilities in the
diagonal cells instead of units, the statement in the text still holds
(Hotelling, 1933, 498). If on correlations corrected for
“attenuation,” it is more complicated (ibid. 499–502).

Suppose, for example, that a man or child has the
following scores in the four tests:

1.29   .36   .72   1.03

This is evidently a person above the average in each test,
since the scores are all positive. His factors will be
obtained by substituting these scores for the z’s in the
above equations, with the result:

\[
\begin{aligned}
y_1 &= 1.062504 \\
y_2 &= .349441 \\
y_3 &= 1.034624 \\
y_4 &= .464757
\end{aligned}
\]

(Of course, in practical work six decimal places would be
absurd. They are given here because we are using this
artificial example to illustrate theoretical points, in place
of doing algebraic transformations, and they need,
therefore, to be exact.)

If these values for the factors are now inserted in the
specification equations above, the scores in the tests
will be reproduced exactly (1.29, .36, .72, and 1.03).
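The passage from scores to factors and back can be sketched as below. The loadings and latent roots are those of the table of loadings. The function names are hypothetical, and the fourth score is taken as 1.03, the value which is consistent with the four factor values printed in the text.

```python
# Loadings: one row per test, one column per factor.
loadings = [
    [0.662218, -0.323324, 0.675967, 0.0],
    [0.856836, -0.135197, -0.312332, -0.387298],
    [0.856836, -0.135197, -0.312332, 0.387298],
    [0.539645, 0.826092, 0.162323, 0.0],
]
latent_roots = [2.198090, 0.823526, 0.678383, 0.300000]


def factors_from_scores(z):
    """Read each column vertically and divide by the root at its foot."""
    return [sum(loadings[i][k] * z[i] for i in range(4)) / latent_roots[k]
            for k in range(4)]


def scores_from_factors(y):
    """Read each row horizontally: the specification equations."""
    return [sum(loadings[i][k] * y[k] for k in range(4)) for i in range(4)]


scores = [1.29, 0.36, 0.72, 1.03]   # fourth score assumed to be 1.03
y = factors_from_scores(scores)
back = scores_from_factors(y)

print([round(v, 6) for v in y])
print([round(v, 2) for v in back])
```

The first factor alone needs only the first column of loadings and the first root, which is the point made in the text about stopping the analysis early.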

Notice, too, that if we have stopped our analysis at less 
than the full number of principal components using unities 
in the diagonal cells, we can nevertheless calculate these 
factors for any person exactly. As soon as we have the 
first column of the table on page 111, we can calculate $y_1$ for
anyone whose scores z we know.

Had we done this with the person whose scores are given
above, we should have summarized his ability in these four
tests by the one statement:

\[ y_1 = 1.062504 \]

This would have been an incomplete statement, but it
is the best single statement that can be arrived at.

6. Principal components in the common-factor space.—
Exactly the same iterative Hotelling process for finding the
principal components, the principal axes of the ellipsoids
of density of the person-points, can be applied to the table
of correlations with communalities in the diagonal cells.
The ellipsoidal swarm of person-points, in the full test space
with orthogonal axes for the tests, remains an ellipsoidal
swarm (though one of fewer dimensions) when projected
on to the common-factor space. The mathematical reader
will know this, or can work it out. The non-mathematical
reader knows it well enough in the number of dimensions
he is personally acquainted with: e.g. an egg, which is an
ellipsoid of three dimensions, throws a shadow on a wall
which is an ellipse, i.e. an ellipsoid of two dimensions. We
shall now analyse the same set of correlation coefficients
using the communalities .2667, .7, .7, .15, which we know
from Chapter V, page 77, reduce the rank of the matrix
to two, and give an analysis with only two common factors.
We found on page 78 the two centroid common factors.
We shall now find the two principal components and find
them very similar.


7. Calculation with communalities.

  .2667    .4     .4     .2     |    .7     .59   . . .    .5913
  .4       .7     .7     .3     |   1      1      . . .   1
  .4       .7     .7     .3     |   1      1      . . .   1
  .2       .3     .3     .15    |    .5     .45   . . .    .4435

 1.0867   1.83   1.83    .815

Taking .7, 1, 1, .5 as a first guess at the multipliers, we find
the weighted sums of the columns to be as shown, and on
dividing through by 1.83 we get the next set of multipliers
.59, 1, 1, .45. Continuing in this way, we arrive quite
soon at .5913, 1, 1, .4435, which, when used as weights,
reproduce themselves. When reduced until the sum of
their squares equals 1.7696 (the largest column total with
these weights), the loadings are:

 .4929   .8336   .8336   .3697

Subtracting the cross-products of these from the original
matrix, and operating on the residues in exactly the same
iterative way, we get for the second factor loadings:

 .1540   −.0712   −.0712   .1158, and no residues.*
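The same iteration applied to the matrix with communalities may be sketched as follows. Two assumptions are labelled in the code: the first communality is taken as exactly 4/15 (printed as .2667), which makes the rank exactly two, and the starting guess is the column holding the largest diagonal cell rather than the guess used in the text. The function name is hypothetical.

```python
import math

# Correlations with communalities in the diagonal cells.
# Assumption: the first communality, printed as .2667, is taken as 4/15.
C = [
    [4 / 15, 0.4, 0.4, 0.2],
    [0.4, 0.7, 0.7, 0.3],
    [0.4, 0.7, 0.7, 0.3],
    [0.2, 0.3, 0.3, 0.15],
]


def largest_component(matrix, tol=1e-12, max_iter=10000):
    """Hotelling iteration on `matrix`; returns (latent_root, loadings)."""
    n = len(matrix)
    # Assumed starting guess: the column holding the largest diagonal cell.
    start = max(range(n), key=lambda i: matrix[i][i])
    m = [matrix[i][start] for i in range(n)]
    root = 0.0
    for _ in range(max_iter):
        sums = [sum(m[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        root = max(sums, key=abs)
        new_m = [s / root for s in sums]
        change = max(abs(a - b) for a, b in zip(new_m, m))
        m = new_m
        if change < tol:
            break
    scale = math.sqrt(root / sum(x * x for x in m))
    return root, [x * scale for x in m]


root1, first = largest_component(C)
residues = [[C[i][j] - first[i] * first[j] for j in range(4)] for i in range(4)]
root2, second = largest_component(residues)
left_over = max(abs(residues[i][j] - second[i] * second[j])
                for i in range(4) for j in range(4))

print(round(root1, 4), [round(x, 4) for x in first])
print(round(root2, 4), [round(x, 4) for x in second])
print(left_over < 1e-9)
```

With more decimals carried than in the text, the second loadings differ from the printed ones in the fourth decimal place, and nothing is left after the second factor.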


If we compare these principal component loadings with the
centroid loadings (page 78) obtained with the same
communalities, we see that they are very similar. But the
sum of squares of the loadings of the first principal
component (1.7694) is slightly larger than the same sum for the
first centroid loadings (1.7652).

* The sums of squares of the loadings (1.7694 and .0471) are the
two first latent roots of the matrix with communalities. The other
two latent roots are zero. The sum of the latent roots equals the
sum of the communalities, the “trace” of the matrix as it is called.

The principal components take out at each stage the maximum
possible variance (sum of squares of loadings). The centroids
nearly do so if the sign-changing is carefully done, but not quite.
The centroids can best be looked at as approximations to the
principal components, more easily calculated. In a battery
of many tests, say two dozen, and with any given
communalities, the principal component process (“weighted
summation”) will take out more variance in, say, six
factors, and leave smaller residues, than will centroid
factors.* But with the kind of data available in psychology,
this advantage does not outweigh the disadvantage of
longer calculation.

* If the Hotelling process is used with guessed communalities, and
the whole is iterated (as was done with centroids on page 88) the
communalities will converge to a set minimizing the sum of squares
of the residuals for a given number of factors. The maximum
likelihood method of Chapter IX arrives at communalities (I understand
from Dr. Lawley) which minimize a weighted sum of squares of the
residuals, each weight being the product of the reciprocals of the two
specific variances concerned.

8. Iterative methods.—Both in the above Hotelling
calculation, and in our discussion of communalities on page
88, we have seen examples of iterative processes, where a
first guess at certain constants gives results which can be
used as a better guess, which gives results which can be
used as a still better guess, which gives . . . and so on
and so on, until the stage is reached where the same
constants emerge as were put in. This sort of process, where
repetition after repetition converges to a result
giving some maximum or minimum value to some quantity,
is not uncommon in mathematics and is rather mysterious
and magical to the layman. An analogy will perhaps
assist understanding. Robinson Crusoe wants to make a
lathe, but he has no wheels and spindles, and to make
wheels and spindles he needs a lathe! He can, however,
whittle crude makeshift wooden wheels, etc., with a knife,
and make a crude lathe with them, with which lathe he can
make somewhat better wheels and therefore a somewhat
better lathe, with which he can make still better wheels
. . . and so on, till he reaches perfection.


CHAPTER VIII 
TESTING RESIDUES FOR SIGNIFICANCE 


1. The object of factorial analysis.—As was said in the first 
section of Chapter I, the objects of factorial analysis are 
both practical and theoretical. The practical desire is to 
reduce the description of a man’s mind* to a comparatively 
few quantitative statements, instead of an unwieldy record 
of innumerable test scores, with a view to giving vocational 
or educational advice. The hope, on the theoretical side, 
is that the “ factors " found may form the structure of a 
theory of mind : and there are some who hope that physio- 
logical or neurological bases may be found for them. Our 
concern in this chapter is with the first point: how to
reduce the number of “ factors ” without sacrificing any 
significant fraction of the information. The insertion of 
communalities in the diagonal cells of a table of correla- 
tions is by many looked upon as one way of doing this, 
since it reduces the number of common factors. Simul- 
taneously, however, it creates and maximizes the influence 
ascribed to specific factors, and the total number of factors 
is increased, not diminished. This will not be discussed 
in the present chapter, which is concerned with another 
way of reducing the number of factors, applicable whether 
communalities or full variances are analysed. If the idea 
of communalities and specifics had never occurred to anyone,
it would still have been possible to reduce the number
of significant common factors to a number less than the 
number of tests. Each principal component, found as 
described in Chapter VII, causes the remaining residues to 
be as small as can be: and the centroid factors of Chapter 
V are nearly as good, if the sign-changing is done properly. 
If, after a few such factors have been extracted, the 
residues are so small as to be statistically negligible, we


* Or of other objects of study, say in agriculture or in engineering. 
See Chapter XII, Section 7. 



might as well stop the analysis, content with the few factors 
extracted. We need, therefore, some test of statistical 
significance, applicable to such residual correlations, to 
know if they are negligible. 

2. The general idea of significance.—The general principle
of such a test of significance is this, that if the residues
we have found, or in practice some function of them, could
only rarely have been produced by the action of chance
sampling, we will assume that they are not due to sampling
but to another factor. How we define “rarely” depends
on circumstances. Usually in psychology “once in twenty
times” (the 5 per cent. point as it is called) is rare enough
to justify taking out another factor. The principle is
straightforward enough; the mathematical difficulty of
finding formulae for calculating the chances is, however, very
great, even for principal components with full variances,
and insuperable when the centroid method is used with
guessed communalities. In consequence a number of
rule-of-thumb criteria have been put forward, to decide
when to stop factorizing.

3. Empirical rules for the number of factors.—Thurstone
(..., 65 et seq.) discusses some of the earlier ones.
One which appeals to common sense is based simply on the
algebraic sum of the residuals (excluding the diagonal cells)
after as many as possible of their signs have been made
positive by the process described in Chapter V (page 71).
As long as this sum goes on sinking, factorization is
continued. When it flattens, the last factor taken out is
rejected and the process stopped. Mosier (1939) found this
the best of five plans he tried, though none was wholly
satisfactory.

Ledyard Tucker’s criterion is that the ratio of the sums
of the absolute values of the residuals, including the
diagonal used, just after and just before the extraction of
a factor must be less than (n − 1)/(n + 1), where n is the
number of tests.

Coombs’ criterion depends upon the number of negative
signs left among the residuals after everything has been
done to reduce them by sign-changing, in the centroid
process. If they are few, another factor may be extracted.
More exactly, the permissible number is given in this
table:


Number of tests 2081015150990: 1125. 380 
Negative signs . . 81 79 149 242 358 
Standard error . ero elo 128. 15 


A fuller table is given in Coombs’ article (1941).

An example of the use of these two will be found in 
Blakey (1940, 126). 

Quinn McNemar (1942), who considers both of the
above inadequate, gives a formula which includes N, the
size of the sample. He takes out factors until $\sigma_s$ reaches
or falls below $1/\sqrt{N}$, where

\[ \sigma_s = S_s \div (1 - M_s), \]

$S_s$ = st. dev. of the residuals after s factors,
$M_s$ = mean communality for s factors.


Others go on until the distribution of the residuals 
ceases to be significantly skew (Swineford, 1941, 378). 
Reyburn and Taylor (1939) divide the residuals by the 
probable errors of the original coefficients, and plot a 
distribution of the results disregarding signs. If it is 
significantly different from a normal curve of the same area 
and with standard deviation 1.4825, they take out more
factors. Swineford (1941, 377) finds the correlation 
between the original correlations and the corresponding 
residuals and takes out factors till it is not significant. 

Another method is based on the sinking of the factor 
loadings with each successive factor instead of on the dying 
away of the residuals. Guilford and Lacey (1947, in a
U.S. Air Force report) stop factorizing when the product
of the two highest factor-loadings falls below $1/\sqrt{N}$.

P. E. Vernon, in a privately circulated manuscript, has 
tested some two dozen methods, as applied when the 
centroid or simple summation method of analysis is used 
with communalities, on two analyses of actual data, on 
645 and 994 cases respectively (Vernon, 1947). His final 
advice is to use the methods of Guilford and Lacey (product
of the two highest factor loadings) and of Mosier
(sum of the residuals) together with Burt’s empirical
formula for the standard error of each factor loading:

\[ \frac{(1 - l^2)\sqrt{n}}{\sqrt{N(n - s + 1)}} \]

where l = the factor loading, N = number of persons, n = number of
tests, s = ordinal number of the factor. If half the
loadings of a factor fall below twice their standard errors
thus found, Vernon recommends rejection of the factor.
If these three methods do not agree, Vernon would
proceed to calculate McNemar's $\sigma_s$ (above), and would
decide on the ... of the four criteria, taking out another
factor if ...
4. More ... methods.—The earliest method was to
compare each residue with the standard error of the original
correlation coefficient and cease factorizing when the
residues all sank below twice these standard errors. But
the use of the formula for the standard error of r is now
frowned upon because of the skewness of the distribution.
Moreover, the sampling errors in the correlation coefficients,
being themselves correlated, produce further factors; and
the ... test tended to stop the analysis too
soon (Wilson and Worcester, 1939). These further factors
must be taken out in order to give elbow room for rotation
of the axes to some psychologically significant position.
For the error factors are not concentrated in the last
factors taken out, but have been entangled with all.
Usually more factors have to be taken out than can be
expected, after rotation, to yield meaningful psychological
factors, and the dimensions are required nevertheless for
the rotations. In geometrical terms, some of the dimensions
of the common factor space will be due to sampling
error, but not the particular dimensions indicated by the
directions of the last factors to be extracted. In terms of
Hotelling's plan, the whole ellipsoid is distorted; its small
major axes are not necessarily due to error, nor
its large ones free from it. A method is described by Wilson
and Worcester (1939, 139) which is, ...
when the number of tests is large. See also Burt (1940,
338–40). Lawley (1940, 76 et seq.) repeated Wilson and
Worcester’s criticism and developed an accurate criterion
described in the next chapter. This is probably the best
plan to use in any research where great accuracy is necessary.
And it is for the case where communalities are employed. 
It is, however, only legitimate when the factor loadings 
have been found by Lawley’s application of the method of 
maximum likelihood. 

Principal components lend themselves to exact treat- 
ment when full unities are used, i.e. there are no specifics 
assumed. Hotelling himself (1933, 437-41) discusses the 
matter of the number which are significant. Davis (1945) 
shows how to find the reliability of each principal compon- 
ent from the reliabilities of the tests, and finds that it may 
happen that a later component is more reliable than an 
earlier one. 

5. M. S. Bartlett’s test of significance for principal
components.—Recently (Bartlett, 1950) a method has been
described for deciding the significance of principal com- 
ponent factors which, while it is unlikely, in its present 
form at least, to be usable in any ordinary cases, ought 
to be briefly described here. It is highly desirable that 
exact methods, or methods where the assumptions made 
and the approximations permitted are clearly realized and 
set out, should gradually replace those based on experience 
only. Bartlett’s method depends upon the latent roots of 
the matrix of correlation coefficients with unity in each
diagonal cell—it is not applicable to communalities. 

Latent roots have been mentioned on page 111, where
they appear as the sums of squares of the loadings of the
tests in each principal component. In the example there
used, their values are:

\[ \lambda_1 = 2.198, \quad \lambda_2 = .824, \quad \lambda_3 = .678, \quad \lambda_4 = .300 \]

They are equal in number to the tests, and their sum also
is exactly 4. Bartlett forms quantities R as follows:

\[
\begin{aligned}
R_2 &= \lambda_3\lambda_4 \left(\frac{2}{\lambda_3 + \lambda_4}\right)^{2} = .8506, & \log_e R_2 &= -0.16182 \\
R_3 &= \lambda_2\lambda_3\lambda_4 \left(\frac{3}{\lambda_2 + \lambda_3 + \lambda_4}\right)^{3} = .7734, & \log_e R_3 &= -0.25696 \\
R_4 &= \lambda_1\lambda_2\lambda_3\lambda_4 = .3684, & \log_e R_4 &= -0.99858
\end{aligned}
\]

and of these we require the natural logarithms, which are
2.3026 times the usual logarithms to the base ten. They
are given above. These logarithms, multiplied by a certain
coefficient, are an approximation to $\chi^2$ for the successive
factors. The coefficient is:

\[ -n + \frac{2p + 5}{6} + \frac{2k}{3} \]

where n is the number of persons tested less one, p is the
number of latent roots, i.e. of tests, and k is the number of
factors already dealt with, i.e. it takes in turn the values
0, 1, 2, . . .

In our example p = 4. If we assume that the number of
persons tested was 20, so that Bartlett’s n = 19, we can
make this table:

 k   d.f.                      χ²                        5 per cent.
                                                           point
 0   3+2+1 = 6   −16.833 × (−.99858) = 16.8095            12.59
 1   2+1 = 3     −16.167 × (−.25696) =  4.1542             7.82
 2   1           −15.500 × (−.16182) =  2.5082             3.84

The quantities in the last column are to be obtained from
a χ² table, entered with the number of degrees of freedom
(d.f.) shown. Only the first factor is significant (16.8095
being greater than 12.59).
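The table above can be rebuilt with a short sketch. The function names `bartlett_table` and `significant_factors` are hypothetical. The 5 per cent. points are simply copied from the table, not computed, and the rule of stopping at the first non-significant factor follows Bartlett's warning quoted in the text.

```python
import math


def bartlett_table(latent_roots, persons):
    """Rows of (k, degrees_of_freedom, chi_squared) for k = 0, 1, ..., p - 2."""
    p = len(latent_roots)
    n = persons - 1                      # Bartlett's n
    rows = []
    for k in range(p - 1):               # the last root is not dealt with
        rest = latent_roots[k:]          # roots not yet removed
        m = len(rest)
        ratio = math.prod(rest) / (sum(rest) / m) ** m
        coefficient = -n + (2 * p + 5) / 6 + 2 * k / 3
        rows.append((k, m * (m - 1) // 2, coefficient * math.log(ratio)))
    return rows


# 5 per cent. points copied from the text, keyed by degrees of freedom.
FIVE_PER_CENT = {6: 12.59, 3: 7.82, 1: 3.84}


def significant_factors(rows):
    """Count factors, stopping at the first one that is not significant."""
    count = 0
    for _, df, chi_squared in rows:
        if chi_squared <= FIVE_PER_CENT[df]:
            break
        count += 1
    return count


roots = [2.198090, 0.823526, 0.678383, 0.300000]
table_20 = bartlett_table(roots, 20)
table_29 = bartlett_table(roots, 29)
print([(k, df, round(x, 2)) for k, df, x in table_20], significant_factors(table_20))
print([(k, df, round(x, 2)) for k, df, x in table_29], significant_factors(table_29))
```

Because the sketch carries the latent roots to six decimals, its values of χ² differ from the printed ones in the third decimal place or thereabouts.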
If we had assumed 29 children (n = 28) we should have
been puzzled by a peculiar result. The three values of χ²
are then 25.80, 6.47, and 3.96, so that it looks as though the
first factor and the third factor are significant, with the
factor in between not significant!* But Bartlett warns
(1950, 78) that this χ² test is only valid if the roots
already removed are significant. As soon as we come
to a non-significant factor, the later factors are also
non-significant. The last factor of all is not dealt with.
“Merely the correlation structure of the variables is being
investigated in its relation to variance,” says Bartlett
(page 80). “For this reason no significance can ever be
attached to the last root, for it would be equivalent to
asking for the correlation structure of a single variable.”†


* Compare the report by Davis (1945) that a later component may 
be more reliable than an earlier one. 

† In a later paper (B.J.P. Statist. 4, p. 1) Bartlett warns that
after one or more significant components have been eliminated it is
safer to take as the number of degrees of freedom

\[ \tfrac{1}{2}(p - k - 1)(p - k + 2) \]

instead of

\[ \tfrac{1}{2}(p - k)(p - k - 1) \]

as used above. This would increase the degrees of freedom in the
second line of the analysis on page 125 from 3 to 5, and in the
third line from 1 to 2, and raise the 5 per cent. level.


CHAPTER IX 


THE MAXIMUM LIKELIHOOD METHOD OF 
ESTIMATING FACTOR LOADINGS * 


(by D. N. Lawley) 


1. Basis of statistical estimation.—In recent times attempts
have been made to introduce into factorial analysis statistical
methods developed in other fields of research. In
particular the method of statistical estimation put forward
by Fisher (1921, page 323 et seq.), and termed the method of
maximum likelihood, has been applied by Lawley (1940,
1941, 1943) to the problem of estimating factor loadings.
This method has the property of using the largest amount
of available information contained in the data and gives
“efficient” estimates, where such exist, of all unknown
parameters, i.e. estimates which, roughly speaking, are on
the average nearer the true values than those obtained by
other, “inefficient,” methods of estimation.

Before using the maximum likelihood method for esti- 
mating factor loadings it is necessary to make certain 
initial assumptions. We assume that both the test scores 
and the factors, of which they are linear functions, are 
normally distributed throughout the population of individuals
to be tested. This assumption of normality has
been the subject of some criticism, but in practice it would 
appear that departure from strict normality of distribution 
is not very serious. It is also necessary to make some 
hypothesis concerning the number of general factors 
which are present in addition to specifics. We shall later 
on show how this hypothesis may be tested, and how it 
may be determined whether the number assumed is, in fact,
sufficient to account for the data.

* For a detailed exposition of the arithmetical procedure of
Lawley's method, with checks, see Emmett (1949).

2. A numerical example.—In order to illustrate the calculations
needed we shall reproduce an example used by
Lawley (1943b), where eight tests were given to 443 individuals.
The table below gives the correlations between
the eight tests, unities having been placed in the diagonal
cells. In this example the hypothesis made is that two
general factors, together with specifics, are sufficient to
account for the observed correlations.


        1       2       3       4       5       6       7       8
1     1.000    .312    .405    .457    .500    .350    .521    .564
2      .312   1.000    .460    .316    .279    .173    .339    .288
3      .405    .460   1.000    .394    .380    .258    .433    .323
4      .457    .316    .394   1.000    .460    .222    .516    .486
5      .500    .279    .380    .460   1.000    .239    .441    .417
6      .350    .173    .258    .222    .239   1.000    .302    .262
7      .521    .339    .433    .516    .441    .302   1.000    .547
8      .564    .288    .323    .486    .417    .262    .547   1.000


The method of estimation about to be described is one 
of successive approximations. Each successive step in the 
calculations gives a set of factor loadings which are nearer 
to the final values than those of the previous set. To 
start the process it is only necessary to guess or to find by 
some means (e.g. by a centroid analysis) first approxima- 
tions to the factor loadings. Any set of figures within 
reason will serve the purpose, though, of course, the better 
the approximation the fewer steps in the calculation will 
be needed. For illustration we shall take as first approxi- 
mations to the factor loadings the set of values given below : 


                                      Tests
Trial
loading in      1       2       3       4       5       6       7       8
Factor I       .73     .50     .66     .66     .62     .40     .73     .70
Factor II      .17    -.27    -.47     .08     .06     .02     .10     .29
Specific
variance     .4382   .6771   .3435   .5580   .6120   .8396   .4571   .4259


Under the loadings are written the corresponding first
approximations to the specific variances (the total variance
of each test being taken to be unity). They are as usual
found by subtracting from unity the sums of squares of
the loadings for each test.

The calculations necessary for obtaining second approximations
to the loadings in factor I may now be set out as
follows :


(a)   1.666    .738   1.921   1.183   1.013    .476   1.597   1.644
(b)   5.647   3.895   5.132   5.129   4.830   3.100   5.647   5.412
(c)   4.917   3.395   4.472   4.469   4.210   2.700   4.917   4.712
                          1/h₁ = 0.14789
(d)    .727    .502    .661    .661    .623    .399    .727    .697


The first row of figures, row (a), is found by dividing the
trial loadings in factor I by the corresponding specific
variances. The figures in row (b) are then given by the
inner products (see footnote, page 74) of row (a) with the
successive rows (or columns) of the correlation table
printed above, and row (c) is obtained by subtracting
from the figures in row (b) the corresponding loadings in
factor I. The quantity h₁² is given by the inner product
of rows (a) and (c), and hence, taking the square root of the
reciprocal of this quantity, we find 1/h₁. Finally, row (d)
is obtained by multiplying the figures in row (c) by 1/h₁,
or .14789. The resulting numbers are then second
approximations to the loadings of the tests in factor I.
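The steps for rows (a) to (d) can be sketched in a few lines. This is only an illustration of the arithmetic described above, not Lawley's own working; the function and variable names are invented for the sketch, and the correlations and trial loadings are those of the example.

```python
import math

# Correlations between the eight tests (unities in the diagonal).
R = [
    [1.000, .312, .405, .457, .500, .350, .521, .564],
    [.312, 1.000, .460, .316, .279, .173, .339, .288],
    [.405, .460, 1.000, .394, .380, .258, .433, .323],
    [.457, .316, .394, 1.000, .460, .222, .516, .486],
    [.500, .279, .380, .460, 1.000, .239, .441, .417],
    [.350, .173, .258, .222, .239, 1.000, .302, .262],
    [.521, .339, .433, .516, .441, .302, 1.000, .547],
    [.564, .288, .323, .486, .417, .262, .547, 1.000],
]
TRIAL_I = [.73, .50, .66, .66, .62, .40, .73, .70]
TRIAL_II = [.17, -.27, -.47, .08, .06, .02, .10, .29]


def inner(u, v):
    """Inner product of two rows of figures."""
    return sum(x * y for x, y in zip(u, v))


def update_first_factor(corr, load1, load2):
    """One step of the successive approximation for factor I."""
    # Specific variances: unity less the sums of squares of the loadings.
    spec = [1 - a * a - b * b for a, b in zip(load1, load2)]
    row_a = [l / v for l, v in zip(load1, spec)]
    row_b = [inner(row_a, row) for row in corr]
    row_c = [b - l for b, l in zip(row_b, load1)]
    h1 = math.sqrt(inner(row_a, row_c))
    row_d = [c / h1 for c in row_c]
    return row_d, 1 / h1


row_d, inv_h1 = update_first_factor(R, TRIAL_I, TRIAL_II)
print([round(x, 3) for x in row_d], round(inv_h1, 5))
```

The sketch carries full precision through rows (a) to (c), so its figures may differ from the printed ones in the last decimal place.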

The most direct way of obtaining second approximations
to the loadings in factor II is to find the residual matrix
which results from removing the effect of factor I, and to
treat it in the same way as the original matrix, using this
time the trial loadings in factor II. A less direct but
considerably shorter method may, however, be obtained by using
once more the original matrix and modifying the process
slightly. The necessary calculations are as shown below :


(e)    .388   -.399  -1.368    .143    .098    .024    .219    .681
(f)    .330   -.560   -.980    .150    .113    .038    .190    .580
                          p₁ = -.0234
(g)    .177   -.278   -.495    .085    .068    .027    .107    .306
                  k₂² = 1.1080      1/k₂ = .9500
(h)    .168   -.264   -.470    .081    .065    .026    .102    .291


Row (e) is found by dividing the trial loadings in factor II
by the corresponding specific variances (thus, .388 is
.17/.4382), while the numbers in row (f) are given by the
inner products of row (e) with the rows of the correlation
table.

The step by which row (g) is obtained from row (f) is
a little more complicated than the corresponding step in
the calculations for the first-factor loadings. From each
number in row (f) we subtract not only the corresponding
trial loading in factor II, but also a correction which
eliminates the effect of factor I; this correction consists
of the corresponding number in row (d) multiplied by
-.0234, the inner product of rows (e) and (d). Thus, for
example, the number .177 in row (g) is equal to

.330 - .170 - .727 × (- .0234)

In general, where more than two factors are assumed to be
present and where further approximations are being calculated
for the loadings in the rth factor, there will be (r - 1)
such corrections to be subtracted, one for each of the
preceding factors.

Having found row (g) the quantity k₂² is now given by the
inner product of rows (e) and (g), from which, taking the
square root of the reciprocal, we derive 1/k₂. Row (h)
is then obtained by multiplying the figures in row (g) by
1/k₂, or .9500. We have thus found second approximations
to the loadings in factor II.
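The modified step for a later factor can be sketched in the same way. The helper below is illustrative only (its name and arguments are assumptions of the sketch); it takes the rows already found for the preceding factors and subtracts one correction for each, as described above. Row (d) is entered as printed.

```python
import math

R = [
    [1.000, .312, .405, .457, .500, .350, .521, .564],
    [.312, 1.000, .460, .316, .279, .173, .339, .288],
    [.405, .460, 1.000, .394, .380, .258, .433, .323],
    [.457, .316, .394, 1.000, .460, .222, .516, .486],
    [.500, .279, .380, .460, 1.000, .239, .441, .417],
    [.350, .173, .258, .222, .239, 1.000, .302, .262],
    [.521, .339, .433, .516, .441, .302, 1.000, .547],
    [.564, .288, .323, .486, .417, .262, .547, 1.000],
]
TRIAL_II = [.17, -.27, -.47, .08, .06, .02, .10, .29]
SPEC = [.4382, .6771, .3435, .5580, .6120, .8396, .4571, .4259]
ROW_D = [.727, .502, .661, .661, .623, .399, .727, .697]  # factor I, row (d)


def inner(u, v):
    """Inner product of two rows of figures."""
    return sum(x * y for x, y in zip(u, v))


def update_later_factor(corr, trial, spec, earlier):
    """One step for a later factor; `earlier` holds the rows already found."""
    row_e = [t / v for t, v in zip(trial, spec)]
    row_f = [inner(row_e, row) for row in corr]
    row_g = [f - t for f, t in zip(row_f, trial)]
    for prev in earlier:  # one correction for each preceding factor
        p = inner(row_e, prev)
        row_g = [g - p * d for g, d in zip(row_g, prev)]
    k = math.sqrt(inner(row_e, row_g))
    return [g / k for g in row_g], 1 / k


row_h, inv_k2 = update_later_factor(R, TRIAL_II, SPEC, [ROW_D])
print([round(x, 3) for x in row_h], round(inv_k2, 4))
```

With a third factor the same call would be made with two rows in `earlier`, giving the (r - 1) corrections mentioned in the text.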

The whole cycle of calculations may now be repeated 
over and over again until the required degree of accuracy 
is reached. In practice, provided that the initial trial 
loadings are not too far out, one repetition of the process 
will usually be found sufficient. In our example the final 
estimates (with possible slight errors in the last decimal 
place) were as follows : 


                                      Tests
Loading in      1       2       3       4       5       6       7       8
Factor I       .725    .503    .664    .661    .623    .399    .726    .694
Factor II      .172   -.261   -.468    .087    .069    .027    .106    .291
Specific
variance       .445    .679    .340    .556    .607    .840    .462    .434


Having obtained these figures, there is, of course, no
objection to rotating the factors as desired in order to
reach a psychologically acceptable position.

3. Testing significance.—A difficulty in most systems of
factorial analysis is to know how many factors it is worth
while to “take out,” and to decide how many of them may
be considered significant. From a statistical point of
view objections can be raised against the majority of
methods at present in use for this purpose. When, however,
the number of individuals tested is fairly large, the
maximum likelihood method provides a satisfactory means
of testing whether the factors fitted can be considered
sufficient to account for the data.

To illustrate this let us return to the example of the
previous section. It is first of all necessary to calculate
the matrix of residuals obtained when the effect of both
factors is removed from the original correlation matrix.
For this purpose we use the final estimates of the loadings
as already given. The residual matrix, with the specific
variances inserted in the diagonal cells, is as follows :

         1        2        3        4        5        6        7        8
1     (.445)   -.008     .004    -.037     .036     .056    -.024     .011
2     -.008    (.679)    .004     .006    -.016    -.021     .001     .015
3      .004     .004    (.340)   -.004    -.001     .006     .001    -.002
4     -.037     .006    -.004    (.556)    .042    -.044     .027     .002
5      .036    -.016    -.001     .042    (.607)   -.011    -.019    -.035
6      .056    -.021     .006    -.044    -.011    (.840)    .009    -.028
7     -.024     .001     .001     .027    -.019     .009    (.462)    .012
8      .011     .015    -.002     .002    -.035    -.028     .012    (.434)


We are now able to calculate a criterion, which we shall
denote by w, for deciding whether the hypothesis that only
two general factors are present should be accepted or
rejected. Each of the above residuals is squared and
divided by the product of the numbers in the corresponding
diagonal cells. Thus, for example, the residual for
Tests 4 and 7 is squared and divided by the product of
the fourth and seventh diagonal elements, giving the result

$$\frac{(.027)^2}{.556 \times .462} = .002838$$

There are altogether 28 such terms, one for each residual,
and w is obtained by forming the sum of these terms and
multiplying it by 443, the number in the sample. The
result is found to be 20.1.
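As a rough check, the criterion can be formed from the residuals and specific variances as printed. The layout of the data and the function name are assumptions of this sketch. Because the printed residuals are rounded to three decimals, the total need not agree exactly with the 20.1 quoted above.

```python
N_SAMPLE = 443
SPEC = [.445, .679, .340, .556, .607, .840, .462, .434]

# Residuals above the diagonal, row by row (tests 1-2, 1-3, ..., 7-8).
UPPER = [
    [-.008, .004, -.037, .036, .056, -.024, .011],
    [.004, .006, -.016, -.021, .001, .015],
    [-.004, -.001, .006, .001, -.002],
    [.042, -.044, .027, .002],
    [-.011, -.019, -.035],
    [.009, -.028],
    [.012],
]


def criterion_w(upper, spec, n_sample):
    """Sum of squared residuals over products of diagonals, times n_sample."""
    total = 0.0
    terms = 0
    for i, row in enumerate(upper):
        for offset, resid in enumerate(row):
            j = i + 1 + offset  # index of the second test of the pair
            total += resid ** 2 / (spec[i] * spec[j])
            terms += 1
    return n_sample * total, terms


w, terms = criterion_w(UPPER, SPEC, N_SAMPLE)
term_4_7 = UPPER[3][2] ** 2 / (SPEC[3] * SPEC[6])  # the worked term for Tests 4 and 7
print(terms, round(w, 1), round(term_4_7, 6))
```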

When the number in the sample is fairly large w is
distributed approximately as χ² with degrees of freedom
given by

$$\tfrac{1}{2}\{(n - m)^2 - n - m\}$$

where n is the number of tests and m is the assumed number
of factors. To test whether the above value of w is
significant we now use a χ² table such as is given by
Fisher and Yates (1938, page 27). In our case, putting
n = 8 and m = 2, the number of degrees of freedom is 13.
Entering the χ² table with 13 degrees of freedom, we find
that the 1 per cent. significance level is 27.7. This means
that if our hypothesis that only two general factors are
present is correct, then the chance of getting a value of w
greater than 27.7 is only 1 in 100. If, therefore, we had
obtained a value of w greater than 27.7 we should have
been justified in rejecting the above hypothesis and in
assuming the existence of more than two general factors.
In our case, however, the value of w is only 20.1, well below
the 1 per cent. significance level. We have thus no
grounds for rejection, and although we cannot state that
only two general factors are present, we have no reason to
assume the existence of more than two.
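The count of degrees of freedom and the comparison with the tabled level can be written out as a small sketch. The function name is illustrative, and the 1 per cent. level is simply the figure quoted in the text for 13 degrees of freedom, not something computed here.

```python
def degrees_of_freedom(n_tests, m_factors):
    """Half of {(n - m)^2 - n - m}."""
    return ((n_tests - m_factors) ** 2 - n_tests - m_factors) / 2


ONE_PER_CENT_LEVEL = 27.7  # quoted in the text for 13 degrees of freedom

df = degrees_of_freedom(8, 2)
w = 20.1                   # value of the criterion found in the text
reject = w > ONE_PER_CENT_LEVEL
print(df, reject)
```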

It must be emphasized that the method described above 
is not applicable if other, inefficient, estimates of the 
loadings are substituted for the maximum likelihood 
estimates. For the value of χ² would in that case be
greatly exaggerated, causing us to over-estimate its 
significance. For this reason we cannot, for example, 
use the method for testing the significance of the re- 
siduals left when factors have been fitted by the centroid 
method. 

4. The standard errors of individual residuals.—A method
has now* been developed for finding the standard errors
of individual residuals. This should be useful when a few
of the residuals are very large, while the rest are small.
In such a case one or more of the residuals may be highly
significant, when tested individually, even though the
value of χ² does not attain significance. The method
ignores errors of estimation of the specific variances, which
are not, however, likely to be very large provided that the
number of tests in the battery is not too small.

* Lawley in the Proc. Roy. Soc., Edin., 1949.

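Let us denote by $l_i$, $m_i$ the estimated loadings of the $i$th
test in the first and second factors respectively (assuming
the existence of only two factors). Let $v_i$ be the specific
variance of the $i$th test, and let us write—

$$h = \sum_i \frac{l_i^2}{v_i}, \qquad k = \sum_i \frac{m_i^2}{v_i}$$

Then the standard error of the residual for the $i$th and $j$th
tests ($i \neq j$) is given by—

$$\sqrt{\frac{1}{N}\left(c_{ii}\,c_{jj} + c_{ij}^2\right)}$$

where $N$ is the number in the sample,

$$c_{ii} = v_i - \frac{l_i^2}{h} - \frac{m_i^2}{k}$$

and

$$c_{ij} = -\frac{l_i l_j}{h} - \frac{m_i m_j}{k}$$

This formula may, of course, be easily extended to take
into account any number of factors.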
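A minimal sketch of the formula for two factors follows, taking c_ii = v_i - l_i²/h - m_i²/k and c_ij = -l_i l_j/h - m_i m_j/k. The final estimates of the eight tests are entered to three decimals, so h and k come out very slightly different from figures worked with more decimals; the function names are illustrative only.

```python
import math

N_SAMPLE = 443
L1 = [.725, .503, .664, .661, .623, .399, .726, .694]    # factor I
L2 = [.172, -.261, -.468, .087, .069, .027, .106, .291]  # factor II
SPEC = [.445, .679, .340, .556, .607, .840, .462, .434]  # specific variances


def weighted_sum_of_squares(loadings, spec):
    """Sum of squared loadings, each divided by its specific variance."""
    return sum(l * l / v for l, v in zip(loadings, spec))


def residual_standard_error(i, j, n_sample=N_SAMPLE):
    """Standard error of the residual for tests i and j (0-based, i != j)."""
    h = weighted_sum_of_squares(L1, SPEC)
    k = weighted_sum_of_squares(L2, SPEC)

    def c(a, b):
        value = -L1[a] * L1[b] / h - L2[a] * L2[b] / k
        return value + SPEC[a] if a == b else value

    return math.sqrt((c(i, i) * c(j, j) + c(i, j) ** 2) / n_sample)


se_14 = residual_standard_error(0, 3)  # first and fourth tests
print(round(se_14, 4))
```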
Let us illustrate the use of the above formula with the
same numerical example as before. If we wish to test the
significance of the residual for the first and fourth tests
after removing two factors, we have—

$$l_1 = .725, \quad m_1 = .172, \quad v_1 = .44479$$
$$l_4 = .661, \quad m_4 = .087, \quad v_4 = .55551$$
$$h = 6.7185, \quad k = 1.0528$$

Hence

$$c_{11} = .33845, \quad c_{44} = .48329, \quad c_{14} = -.08554$$

and the standard error is

$$\sqrt{\frac{1}{443}\left(c_{11}\,c_{44} + c_{14}^2\right)} = .0196$$

Thus the residual in question has a value of .037 with a
standard error of .020. It is clearly not significant.

5. The standard errors of factor loadings.—When maximum
likelihood estimation has been used, we are able to
find the standard errors of not only the residuals but also
the estimated factor loadings. Using the same notation as
in the preceding section, with $N$ the number in the sample,
the sampling variance of $l_i$, the loading of the $i$th test in
the first factor, is (assuming the test to be standardized)—

$$\frac{1}{N}\left(1 + \frac{1}{h}\right)\left\{1 - \frac{1}{2}\left(1 + \frac{1}{h}\right)l_i^2\right\}$$

and the standard error is the square root of this.
The covariance between any two first factor loadings $l_i$
and $l_j$ is given by—

$$-\frac{1}{2N}\left(1 + \frac{1}{h}\right)^2 l_i l_j$$


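The formulae for the variances and covariances of the
subsequent factor loadings are more complex. Thus the
variance of $m_i$, the loading of the $i$th test in the second
factor, is—

$$\frac{1}{N}\left(1 + \frac{1}{k}\right)\left\{1 - \left(1 + \frac{1}{h}\right)l_i^2 - \frac{1}{2}\left(1 + \frac{1}{k}\right)m_i^2\right\}$$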


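while the covariance between $m_i$ and $m_j$ is

$$-\frac{1}{N}\left(1 + \frac{1}{k}\right)\left\{\left(1 + \frac{1}{h}\right)l_i l_j + \frac{1}{2}\left(1 + \frac{1}{k}\right)m_i m_j\right\}$$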


The results for the general case, where more than two
factors have been assumed present, may be written down
without difficulty. Each factor will give rise to one more
term within the curly brackets than the preceding factor.
It should be noted that the last of such terms, and that
alone, is multiplied by ½.

The variances and covariances of loadings in any factor 
are those for given values of the loadings in all preceding 
factors. 

It must be stressed that all the above results are applic- 
able only to the unrotated loadings. 

In our numerical example, we find—

$$1 + \frac{1}{h} = 1.14884, \qquad 1 + \frac{1}{k} = 1.9498$$

Hence the variance of $l_1$, for example, is

$$\frac{1.14884}{443}\left\{1 - \tfrac{1}{2} \times 1.14884 \times .725^2\right\} = .001810$$

while that of $m_1$ is—

$$\frac{1.9498}{443}\left\{1 - 1.14884 \times .725^2 - \tfrac{1}{2} \times 1.9498 \times .172^2\right\} = .001617$$

Thus the loading of test 1 in the first factor is .725, with a
standard error of—

$$\sqrt{.001810} = .043$$

and its loading in the second factor is .172, with a standard
error of—

$$\sqrt{.001617} = .040$$
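The two variances just worked can be checked with a short sketch. It takes h = 6.7185 and k = 1.0528 as given in the preceding section and a sample of 443; the function names are illustrative, and only the two-factor case is covered.

```python
import math

N_SAMPLE = 443
H = 6.7185  # sum of l_i^2 / v_i, as found in the preceding section
K = 1.0528  # sum of m_i^2 / v_i, as found in the preceding section


def variance_first(l_i, h=H, n=N_SAMPLE):
    """Sampling variance of a first-factor loading (standardized test)."""
    a = 1 + 1 / h
    return a / n * (1 - 0.5 * a * l_i ** 2)


def variance_second(l_i, m_i, h=H, k=K, n=N_SAMPLE):
    """Sampling variance of a second-factor loading, given the first."""
    a = 1 + 1 / h
    b = 1 + 1 / k
    return b / n * (1 - a * l_i ** 2 - 0.5 * b * m_i ** 2)


var_l1 = variance_first(.725)
var_m1 = variance_second(.725, .172)
se_l1 = math.sqrt(var_l1)
se_m1 = math.sqrt(var_m1)
print(round(var_l1, 6), round(var_m1, 6), round(se_l1, 3), round(se_m1, 3))
```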


6. Advantages and disadvantages.—To sum up: the
chief advantage of the maximum likelihood method of
estimating factor loadings is that it does lead to efficient
estimates and does provide a means of deciding how many
factors may be considered necessary. It unfortunately
takes, however, much longer to perform than a centroid
analysis, particularly when the battery of tests is a large
one and when several factors are to be fitted. The chief
labour of the process lies in the calculation of the various
inner products; although in this respect it does not differ
greatly from Hotelling’s method of finding “principal
components.” The maximum likelihood method is thus
likely to be most useful in cases where accurate estimation
is desirable and where it is proposed to make a test of
significance.

The method also possesses the advantage of being 
independent of the units in which the test scores are 
measured. The same system of factors is therefore 
obtained whether the correlation or the covariance matrix 
is analysed. The loadings in the one case are directly 


proportional to those in the other. 


PART III 
THE ROTATION OF FACTORS 




CHAPTER X 
THE ROTATION OF FACTORS 


1. Is rotation necessary ?—The factors or axes arrived at 
by the centroid process (or as principal components) are 
not at all the same sort of things as the Spearman system 
and its extensions gave. The Spearman factors, though 
mathematical devices are used in calculating their loadings, 
have psychological meaning from the first. Their names
indicate this—general intelligence, the verbal factor, etc.
There is no need for rotating them.

With the other kind of factor, the case is different. As
first obtained, they make no claim to have psychological
meaning. Their virtue is a purely mathematical virtue—
they each explain, in turn, as much as possible of the variance
of the tests, and arrive with as few common factors
as possible at negligible residues. The loadings of the
first centroid* factor are usually all positive, and it runs as
a positive factor through all the tests. But it is not as a
rule identical with Spearman's g. The succeeding centroid
factors have each negative loadings in about half the
tests, and are often referred to as bipolar factors. They
may be looked upon as repeatedly classifying the tests into
subgroups, and this classification may be expressed by a
kind of family tree :

Factor I                      All loadings positive
                                       |
                       +---------------+---------------+
                       |                               |
Factor II      Positive loadings               Negative loadings
                       |                               |
                 +-----+-----+                   +-----+-----+
                 |           |                   |           |
Factor III   Positive    Negative            Positive    Negative
             loadings    loadings            loadings    loadings

Not infrequently the sub-families into which this bipolar
classification analyses the tests will have something
psychological in common, and to that extent these factors in such
cases may claim to have psychological meaning. Much
depends on how the battery of tests is made up. And
such bipolar classification is more natural in tests of
temperament and character, where common speech has many
bipolar phrases (as brave-cowardly, modest-cheeky, etc.),
than in tests of an intellectual nature, though there too
bipolar pairs of words are found, like clever-stupid.

* This is the most convenient name, to avoid verbosity. But
unless it is otherwise stated, may it be understood that principal
components are equally referred to.

Many psychologists, however, especially if they tend to
look upon factors as real mental entities, even perhaps with
physiological causes, find it difficult to admit all those
negative loadings. A mental ability or factor, they argue,
is on the whole something which helps us to do things, not
hinders. A few negative loadings they can understand :
but not so many as half the loadings. So they wish to
turn the centroid axes into positions where most of the
loadings will be positive, and moreover positions to which
they can give psychological meaning, and which will be
found and be recognizable in different batteries of tests.
For this purpose the factor-analyst must be instructed in
methods of rotating the centroid factors into new positions.

2. Methods of rotation.—One method, Alexander’s, has
already been described earlier in this book on pages 79
to 80. It was used by Alexander himself with excellent
effect (Alexander, 1935), but involves assuming (a) that the
communality of a certain test is entirely due to one factor ;
(b) that the communality of a second test is entirely due
to this factor and one other ; (c) and so on for r - 1 tests,
where r is the number of factors. The criterion of success
with this method is to see whether, when these assumptions
are made, negative loadings disappear ; and whether the
consequent loadings of those tests about which no assumptions
are made are compatible with the psychologist’s
psychological analysis of them. Alexander’s assumptions,
however, cannot generally be made in a usual battery of
tests, and other methods of rotation are required. The
simplest plan is to rotate the factors two at a time in their
own plane. An example will best explain this.

3. Two-by-two rotation.—Let us suppose that we have
the following set of loadings in eight tests for three factors :


        I₀      II₀     III₀   |    h²
1       .4      .4      .1     |   .33
2       .7      .3     -.4     |   .74
3       .7     -.2     -.3     |   .62
4       .9     -.1      .3     |   .91
5       .5      .2     -.2     |   .33
6       .8     -.4      .1     |   .81
7       .6      .5     -.2     |   .65
8       .5     -.3      .4     |   .50

Suppose further that we want to rotate to positions of the
three axes where there will be no negative loadings, or at
least only few, and those small. We shall do this taking
the axes two by two, and rotating each pair in its own plane.
Take first axes I₀ and II₀, where the subscripts indicate
that no rotation has yet taken place. Draw a diagram, using
the loadings on I₀ and II₀ as co-ordinates (Figure 21).
We can see at once that if we rotate the axes to new positions
I₁ and II₁ they will enclose all the test points in their
positive quadrant, and all the loadings on these two axes will
be positive. The position is, however, not unique, for we
could have rotated a little farther, or a little less, than θ
and still enclosed all the points. I have taken θ as 37°,
with sine θ = .6 and cosine θ = .8.

[Figure 21]

Consider now the point 5. Its co-ordinates on the former
axes were .5 and .2, and clearly its new co-ordinates are—

.5 cos θ - .2 sin θ = .28
and .5 sin θ + .2 cos θ = .46

These can be checked approximately on the diagram, and
this should always be done, at least by eye if not by
measurement. The new loadings of each of the tests can
be calculated in the same way, giving—


        I₁      II₁   |  Sum of squares
1      .08     .56    |      .32
2      .38     .66    |      .58
3      .68     .26    |      .53
4      .78     .46    |      .82
5      .28     .46    |      .29
6      .88     .16    |      .80
7      .18     .76    |      .61
8      .58     .06    |      .34


At this point two checks should be made: (1) The sum
of the squares of the loadings of any test in these two factors
should not have altered. Thus .08² + .56² is the same as
.4² + .4² for the first test. (2) The inner product of any
pair of rows should not have altered. Thus, for the first
two tests :

.4 × .7 + .4 × .3 = .40
and
.08 × .38 + .56 × .66 = .4000

It is sufficient to check only adjacent rows.
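The rotation of the first pair of axes, together with the two checks, can be sketched as follows. The loadings are those of the eight tests of this section on I₀, II₀ and III₀; the function names are illustrative only.

```python
# Loadings of the eight tests on I0, II0, III0.
LOADINGS = [
    (.4, .4, .1), (.7, .3, -.4), (.7, -.2, -.3), (.9, -.1, .3),
    (.5, .2, -.2), (.8, -.4, .1), (.6, .5, -.2), (.5, -.3, .4),
]
SIN, COS = .6, .8  # theta taken as about 37 degrees


def rotate_pair(x, y, sin_t, cos_t):
    """New co-ordinates of a point after turning the two axes through theta."""
    return x * cos_t - y * sin_t, x * sin_t + y * cos_t


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


# Rotate I0 and II0, leaving the third column untouched.
rotated = [rotate_pair(a, b, SIN, COS) + (c,) for a, b, c in LOADINGS]

# Check (1): sums of squares unchanged.  Check (2): adjacent inner products unchanged.
squares_ok = all(abs(inner(o, o) - inner(n, n)) < 1e-12
                 for o, n in zip(LOADINGS, rotated))
products_ok = all(
    abs(inner(LOADINGS[r], LOADINGS[r + 1]) - inner(rotated[r], rotated[r + 1])) < 1e-12
    for r in range(len(LOADINGS) - 1)
)
print([tuple(round(v, 2) for v in row[:2]) for row in rotated], squares_ok, products_ok)
```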


Our three axes are now I₁, II₁, and III₀, and III₀
still has negative loadings. We must therefore rotate it
with one of the others, which will have its loadings
further changed. Let us choose I₁ and III₀, and with
their loadings make this diagram (Figure 22).

A little trial with a square corner of a piece of paper
shows us that we cannot rotate the axes to a position
which will completely enclose all the points, though
we very nearly can. We finally decide to make I₂ go
exactly through point 2, whose co-ordinates in this
diagram are .38 and -.4. The sine and cosine of θ are
therefore :

$$\frac{.4}{\sqrt{.38^2 + .4^2}} \quad\text{and}\quad \frac{.38}{\sqrt{.38^2 + .4^2}}$$

or .725 and .689
(check that .725² + .689² = unity)
The loadings of the point 5, for example, are then :

.689 × .28 - .725 × (- .2) = .338 on I₂
and .725 × .28 + .689 × (- .2) = .065 on III₂

as can be approximately checked by a look at the diagram.
In the same way the other loadings on I₂ and III₂ can be
found, giving the complete table :


        I₂      II₁     III₂       h²
1     -.017    .560     .127     .3300
2      .552    .660     .000     .7403
3      .686    .260     .286     .6200
4      .320    .460     .772     .9100
5      .338    .460     .065     .3301
6      .534    .160     .707     .8106
7      .269    .760    -.007     .6500
8      .110    .060     .696     .5001


The sums of squares of each row ought to give the same
values for h² as did the original table in I₀, II₀, and III₀.
And the inner product of any pair of rows ought to be
identical also. For example, taking the last pair (it is
sufficient to check adjacent rows), we have from this table :

.269 × .110 + .760 × .060 - .007 × .696 = .0703
and from the other :
.6 × .5 - .5 × .3 - .2 × .4 = .07

We have now succeeded in replacing our original analysis,
which had many negative loadings, by one which has only
positive loadings (except for the two loadings which,
although negative, are nearly zero) and gives the same
correlations and communalities.
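The second rotation, in which the new axis is made to pass through point 2, can be sketched in the same manner. The starting figures are the loadings on I₁, II₁ and III₀ after the first rotation; the variable names are illustrative.

```python
import math

# Loadings on I1, II1, III0 after the first rotation.
AFTER_FIRST = [
    (.08, .56, .1), (.38, .66, -.4), (.68, .26, -.3), (.78, .46, .3),
    (.28, .46, -.2), (.88, .16, .1), (.18, .76, -.2), (.58, .06, .4),
]

# Point 2 in the plane of I1 and III0 fixes the angle of rotation.
x2, _, z2 = AFTER_FIRST[1]
length = math.hypot(x2, z2)
sin_t, cos_t = -z2 / length, x2 / length  # about .725 and .689

# Rotate I1 and III0, leaving II1 untouched.
final = [(cos_t * x - sin_t * z, y, sin_t * x + cos_t * z)
         for x, y, z in AFTER_FIRST]

communalities = [sum(v * v for v in row) for row in final]
print(round(sin_t, 3), round(cos_t, 3))
print([tuple(round(v, 3) for v in row) for row in final])
```

Since the sketch keeps the sine and cosine at full precision, the communalities come out as exactly .33, .74, and so on, without the small discrepancies in the fourth decimal that arise from using .725 and .689.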

4. An orthogonal rotating matrix.—If the reader will, in
imagination, picture in his mind those original axes I₀, II₀,
and III₀ as three lines at right angles to each other
(orthogonal, as we say), he can further imagine them being
turned bodily, using their common meeting-place as the
swivelling-point and keeping them orthogonal, into their
final positions I₂, II₁, and III₂. Actually we did it in two
steps, but imagine it happening as one complex movement.

Arithmetically, this one movement can be imitated by
“post-multiplying the original table of loadings by an
orthogonal matrix,” a piece of jargon we must hasten to
explain. And the reader may miss this section out on
first reading. A matrix, in mathematics, is an oblong or
square set of numbers, to be used as an operator on other
quantities. In our case it is to be used to rotate the original
loadings to new positions. And since we want the axes to
remain orthogonal, we have to use an orthogonal matrix,
i.e. one in which the sum of the squares of any column or
row is unity, and the inner product of any pair of rows or
of columns is zero. Actually the orthogonal matrix which
performs the rotation of the above section 3 is :


|   .5512    .6000    .5800 |
|  -.4134    .8000   -.4350 |
|  -.7250    .0000    .6890 |


(The reader can check the sum of squares of any column or
row, and any inner product of a pair.) Before explaining
how these numbers are arrived at, let us first perform the
post-multiplication of the table of original loadings (itself an
oblong matrix) by this rotating matrix—


| .4   .4   .1 |                                      | -.017   .560   .127 |
| .7   .3  -.4 |                                      |  .552   .660   .000 |
| .7  -.2  -.3 |       |  .5512   .6000   .5800 |     |  .686   .260   .286 |
| .9  -.1   .3 |   ×   | -.4134   .8000  -.4350 |  =  |  .320   .460   .772 |
| .5   .2  -.2 |       | -.7250   .0000   .6890 |     |  .338   .460   .065 |
| .8  -.4   .1 |                                      |  .534   .160   .707 |
| .6   .5  -.2 |                                      |  .269   .760  -.007 |
| .5  -.3   .4 |                                      |  .110   .060   .696 |


We have to say post-multiplication because in matrix
algebra the product AB is not the same as the product BA.

Matrix multiplication is performed by finding the inner
product of each row of the first matrix with each column of
the second matrix. Thus—

.4 × .5512 - .4 × .4134 - .1 × .7250 = - .0174 or - .017

the first item in the product matrix above. Similarly, the
quantity .707, which appears in the sixth row and third
column of the product matrix, is the inner product of the
sixth row of the first matrix and the third column of the
second—

.8 × .5800 + .4 × .4350 + .1 × .6890 = .7069 or .707

The reader can similarly check the other entries in the
product matrix.
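Post-multiplication is easily sketched with a small row-by-column routine. In the sketch below the rotating matrix is not typed in directly but built by multiplying together the two plane rotations of section 3 (sine .6 and cosine .8, then sine .725 and cosine .689), written as 3 × 3 matrices; that way of arriving at it, and all the names used, are assumptions of the sketch.

```python
def matmul(a, b):
    """Inner product of each row of a with each column of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


LOADINGS = [
    [.4, .4, .1], [.7, .3, -.4], [.7, -.2, -.3], [.9, -.1, .3],
    [.5, .2, -.2], [.8, -.4, .1], [.6, .5, -.2], [.5, -.3, .4],
]
FIRST = [[.8, .6, 0], [-.6, .8, 0], [0, 0, 1]]            # turns I0 and II0
SECOND = [[.689, 0, .725], [0, 1, 0], [-.725, 0, .689]]   # turns I1 and III0

ROTATION = matmul(FIRST, SECOND)      # both movements in one matrix
rotated = matmul(LOADINGS, ROTATION)  # post-multiplication of the loadings
gram = matmul(transpose(ROTATION), ROTATION)  # near the unit matrix if orthogonal

for row in ROTATION:
    print([round(v, 4) for v in row])
print([round(v, 3) for v in rotated[0]], [round(v, 3) for v in rotated[5]])
```

Because .689 and .725 are rounded, the squares of the columns sum to unity only to about three decimals; the order of the factors in `matmul(FIRST, SECOND)` matters, since the product taken the other way round is a different matrix.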
When we performed the first of our previous two-by-two
rotations we were in effect post-multiplying the loadings by
the rotating matrix—


|   .8    .6    0 |
|  -.6    .8    0 |
|    0     0    1 |


which will leave the column III₀ unchanged because of the
nature of the third column of this rotating matrix. The
inner product of 0, 0, and 1 with any row of the centroid
loadings will give a column of loadings identical with III₀.

When we performed the second two-by-two rotation, of
I₁ and III₀, we were in effect multiplying by the matrix—


|   .689    0    .725 |
|   .000    1    .000 |
|  -.725    0    .689 |


which clearly does not alter the middle axis. And the
rotating matrix which would have done these two operations
simultaneously is—


|  .8   .6   0 |     |  .689   0   .725 |     |  .5512   .6000   .5800 |
| -.6   .8   0 |  ×  |  .000   1   .000 |  =  | -.4134   .8000  -.4350 |
|   0    0   1 |     | -.725   0   .689 |     | -.7250   .0000   .6890 |


5. Reyburn and Taylor's method.—These South African
psychologists have proposed to let psychological insight
alone guide the rotations to which axes are subjected.
They do not necessarily insist on a g (see their 1941a, pages
258, 254, 258, etc.). Their plan is to choose a group of
tests which their psychological knowledge, and a study of
all that is previously known, leads them to consider to be
clustered round a factor. They therefore cause one of their
axes to pass through the centroid of this cluster, keeping all
axes orthogonal. This factor axis they do not subsequently
move. They then formulate a hypothesis about
a second factor and select a second group of tests, through
whose centroid (retaining orthogonality) they pass their
second factor axis. And so on. There is some affinity
between this and Alexander’s method of rotation (see
page 79).
The arithmetical details of their method are as follows. 
They first obtain a table of centroid loadings in the usual 
way. Then, having chosen a group of tests which they 
think form, psychologically, a cluster, they add together 
the rows of the centroid table which refer to those tests, 
thus obtaining numbers proportional to the loadings of 
their centroid. These, after being normalized, form the 
first column of their rotating matrix. For example, 
consider this (imaginary and invented) table of loadings : 


              Loadings
        I       II      III    |    h²
1       .4      .3      .1     |   .26
2       .5     -.3     -.6     |   .70
3       .6     -.3     -.3     |   .54
4       .5      .2      .1     |   .30
5       .4      .4     -.2     |   .36
6       .5     -.4      .2     |   .45
7       .5      .2     -.1     |   .30
8       .7      .4      .1     |   .66
9       .7     -.2      .3     |   .62
10      .6     -.4      .4     |   .68

Reyburn and Taylor now decide, let us suppose, that Tests
9 and 10 are, in their psychological view, very strongly
saturated with a verbal factor, and determine to rotate
their factors until one of them passes through the
centroid of these two tests. They extract their rows, add
them together, and normalize the three totals thus :


(9)      .7     -.2      .3
(10)     .6     -.4      .4
        1.3     -.6      .7      Sum of squares 2.54 = 1.594²

        .816   -.376    .439     obtained by dividing by 1.594
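The adding and normalizing of the chosen rows can be sketched as follows. The helper names are invented for the illustration; the rows are those of Tests 9 and 10.

```python
import math

CLUSTER_ROWS = [(.7, -.2, .3),   # Test 9
                (.6, -.4, .4)]   # Test 10


def centroid_direction(rows):
    """Add the rows column by column and reduce the totals to unit length."""
    totals = [sum(column) for column in zip(*rows)]
    length = math.sqrt(sum(t * t for t in totals))
    return [t / length for t in totals], length


def loading_on_new_axis(row, direction):
    """Inner product of a test's loadings with the normalized totals."""
    return sum(x * d for x, d in zip(row, direction))


direction, length = centroid_direction(CLUSTER_ROWS)
new_9 = loading_on_new_axis(CLUSTER_ROWS[0], direction)
print(round(length, 3), [round(d, 3) for d in direction], round(new_9, 3))
```

The last figure is the loading of Test 9 on the axis through the centroid of the cluster; the other tests are treated in the same way.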


If the columns of the original table are multiplied by these
three numbers and the rows added, the result is the first
column of the rotated factor loadings in the table below.
To get the other two columns we must complete the rotating
matrix in such a manner that the axes remain orthogonal.
How this is done will be explained separately later.
Meanwhile, consider the matrix—


|   .816    .399    .417 |
|  -.376   -.183    .909 |
|   .439   -.898    .000 |


Its first column is composed of the above numbers. It is
orthogonal, for the sum of the squares of any row or column
is unity, and the inner product of any two is zero. When
the original table of loadings is post-multiplied by this we
get the rotated table :


         Rotated Loadings        |    h²
1      .258     .015     .440    |   .260
2      .257     .793    -.064    |   .699
3      .471     .564    -.022    |   .540
4      .377     .073     .390    |   .300
5      .088     .266     .530    |   .359
6      .646     .093    -.155    |   .450
7      .289     .253     .390    |   .300
8      .465     .116     .656    |   .660
9      .778     .047     .110    |   .620
10     .816    -.047    -.113    |   .681


Order 4       a     b    -c    -d
              b    -a     d    -c
              c     d     a     b
              d    -c    -b     a


This one was used by Reyburn and Taylor in their 1939 
article (page 159). 
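That the matrix of order 4 is orthogonal whenever a² + b² + c² + d² = 1 can be checked numerically with a short sketch. The particular values chosen for a, b, c, d below are arbitrary, and the function names are illustrative.

```python
def order_four(a, b, c, d):
    """Matrix of order 4 whose first column is a, b, c, d."""
    return [
        [a, b, -c, -d],
        [b, -a, d, -c],
        [c, d, a, b],
        [d, -c, -b, a],
    ]


def column_products(m):
    """Inner products of every pair of columns (unit matrix if orthogonal)."""
    cols = list(zip(*m))
    return [[sum(x * y for x, y in zip(u, v)) for v in cols] for u in cols]


# Any first column will do, provided its squares sum to unity.
M = order_four(.5, .5, .1, .7)
products = column_products(M)
first_column = [row[0] for row in M]
print(first_column)
print([[round(v, 12) for v in row] for row in products])
```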

Similar matrices of higher order can be made by a 
recipe given by them, viz. multiplying together two or 
more of the above, suitably extended by ones and zeros. 
For example, a matrix orthogonal and with arbitrary first 
column, of order 5, can be made by multiplying together : 


C Hp Ap n | | mq —Iq p 4 ` 
ey 13 —àn —ọ mp  —lp —q : E 
: À u |x l mn 5 z 5 
atris ; ay || 1 
1 b . . . | . 1 


ls 


where? + m? =p +E — X 4 y? —z 49-1. 

7. Principles deciding where to stop rotation.—We have 
mentioned two principles, (a) the desire to rotate to posi- 
tions where there will be few, if any, negative loadings 
—but usually this is insufficient to define a final position 
uniquely, and (b) Reyburn and Taylor's plan of following 
their psychological intuition in placing the axes. They 
too accept the need for mainly positive loadings, and they 
keep their axes at right angles. We turn now, in our next 
chapter, to a principle (Simple Structure) which is accepted 
widely in America, though hardly at all in Great Britain. 


CHAPTER XI 


ORTHOGONAL SIMPLE STRUCTURE 


1. Agreement of mathematics and psychology.—It is clear
that the process of multifactor analysis is one by which a
definition of the primary factors is arrived at by satisfying
simultaneously certain mathematical principles and certain
psychological intuitions. When these two sides of the process
click into agreement, the worker has a sense of having made a
definite step forward. The two support one another. Obviously
the goal to be hoped for along this line of advance will be the
discovery of some mathematical process which always leads to a
unique set of factors psychologically acceptable to the
psychologist. If such could be discovered and found to produce
a few factors over and above those recognized as already known
by other means, the new factors would stand a good chance of
acceptance on the strength of their mathematical descent only.
And no doubt the psychologist would be prepared to make a few
concessions and changes in his previous ideas to fit in with any
mathematical scheme which already gave much satisfaction and
was objective and unique in its results.

It is claimed that Thurstone’s notion of “simple structure”
is such a solution (Vectors, Chapters 6-8). This idea is that the
axes are to be rotated until as many as possible of them are at
right angles to as many as possible of the original test vectors;
and that the battery is not suitable for defining factors unless
such a rotation is uniquely possible, a rotation which will leave
every axis at right angles to at least as many tests as there are
factors, and every test at right angles to at least one axis.
When the vectors of a test and a factor are at right angles, the
loading of the factor in that test is zero. Thurstone’s “simple
structure” is therefore indicated by a large number of zeros in
the matrix of loadings, so large
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At this point the usual two checks must be made, of h? and 
of the inner products of consecutive rows. 

The first factor now goes through the centroid of Tests 
9 and 10, and we scan the loadings it has in the other 
tests to see if these are consistent with their psychological 
nature. For instance, Test 5 has practically no loading on 
this verbal factor—is this consistent with our psychological 
opinion of this test ? 

If this scrutiny is satisfactory, the psychologist using 
this method then proceeds to consider where he will place 
his second factor ; for the second and third columns of the 
above loadings have still no necessary psychological mean- 
ing as they stand. Exactly the same procedure is carried 
out with them, the first column being left unaltered. 
Suppose the psychologist decided on Tests 5, 7, 8 as being
a cluster round (say) a numerical factor. He adds their
rows—

    (5)    .266     .530
    (7)    .253     .390
    (8)    .116     .656

           .635    1.576
           .374     .928   when normalized

and uses their normalized totals as the first column of a
matrix to rotate these last two columns. The matrix
must be orthogonal, and it is in fact—

    .374    .928
    .928   −.374


When the second and third columns are rotated by post- 
multiplication by this, the final result is given opposite. 
(The same checks must now be repeated.) The psycho- 
logist now scans column two to see if the loadings of his 
numerical factor agree reasonably with his idea of each 
test, and is rather sorry to see two negative loadings, but
consoles himself by thinking that they are small. He 
must finally try to name his third factor, present to an 
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Final Rotated Loadings 


  1  |   .258    .414   −.151
  2  |   .257    .237    .760
  3  |   .471    .191    .532
  4  |   .377    .389   −.078
  5  |   .088    .591    .049
  6  |   .646   −.107    .149
  7  |   .289    .457    .089
  8  |   .465    .652   −.138
  9  |   .778    .120    .002
 10  |   .816   −.122   −.001


appreciable extent only in tests 2 and 3. If he thinks he 


recognizes it, he is content. 
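
The second rotation just described can be retraced as follows. This is a sketch under the assumption that the order-2 rotating matrix has the rows (u, v) and (v, −u); the variable names are illustrative.

```python
import math

# Columns two and three of the rotated loadings quoted in the text.
cols_2_3 = {
    1: (0.015, 0.440), 2: (0.793, -0.064), 3: (0.564, -0.022),
    4: (0.073, 0.390), 5: (0.266, 0.530), 6: (0.098, -0.155),
    7: (0.253, 0.390), 8: (0.116, 0.656), 9: (0.047, 0.110),
    10: (-0.047, -0.113),
}
cluster = (5, 7, 8)  # tests chosen as the cluster round the numerical factor

total_x = sum(cols_2_3[t][0] for t in cluster)
total_y = sum(cols_2_3[t][1] for t in cluster)
length = math.hypot(total_x, total_y)
u, v = total_x / length, total_y / length  # normalized totals

# Post-multiply each row (x, y) by the order-2 matrix [[u, v], [v, -u]].
final = {t: (x * u + y * v, x * v - y * u) for t, (x, y) in cols_2_3.items()}

# Tests whose loading on the new second factor comes out negative.
negatives = [t for t, (second, _) in final.items() if second < 0]
print(round(u, 3), round(v, 3), negatives)
```

The list `negatives` picks out the two small negative loadings over which the psychologist consoles himself.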

6. Special orthogonal matrices.—To carry out the above
process the reader needs to have at his disposal orthogonal
matrices of various sizes, such that he can give the first column
any desired values. The following will serve this purpose.
Except for the first one, they are not unique, and alternatives
can be made.

Order 2       u     v     |   u² + v² = 1
              v    −u     |

Order 3      mq    mp    l    |   l² + m² = 1
            −lq   −lp    m    |   p² + q² = 1
              p    −q    .    |

It was from this formula that the matrix used in the last
section, with first column of .816, −.376, .439, was made.
For if we set
                  p = .439
we have           q = .898
and from         mq = .816
we have           m = .909
and thence        l = .417
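
The recipe for orders 2 and 3 may be sketched as below. The function names are invented for illustration, and the order-3 routine assumes a first column of unit length whose third entry is not ±1 (so that q is not zero).

```python
import math

def order2(u, v):
    """Orthogonal 2x2 matrix with first column (u, v); assumes u*u + v*v = 1."""
    return [[u, v], [v, -u]]

def order3(first_column):
    """Orthogonal 3x3 matrix of the pattern [mq, mp, l; -lq, -lp, m; p, -q, 0].

    The first column (mq, -lq, p) is the one supplied by the caller.
    """
    c1, c2, p = first_column
    q = math.sqrt(1.0 - p * p)
    m = c1 / q
    l = -c2 / q
    return [[m * q, m * p, l], [-l * q, -l * p, m], [p, -q, 0.0]]

def max_deviation_from_identity(mat):
    """Largest absolute entry of M'M - I."""
    cols = list(zip(*mat))
    n = len(cols)
    return max(abs(sum(a * b for a, b in zip(cols[i], cols[j]))
                   - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

M = order3([0.816, -0.376, 0.439])
for row in M:
    print([round(x, 3) for x in row])
```

Because the quoted first column is rounded to three decimals, the result agrees with the printed matrix only to about the third place.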
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that there will be only one position of the axes (if any) 
which satisfies the requirement. His search, be it repeated, 
is for a set of conditions which will make the solution 
unique. We have seen him approaching this goal by 
stages. Unless the battery is large, so that—

$$n \ge \frac{(2r + 1) + \sqrt{8r + 1}}{2}$$


(see Chapter V, Section 9), the communalities are not 
unique. Even when the battery is large enough, the axes 
representing factors may be rotated to positions among 
which there is no one specially marked out. Then comes 
the demand that there be this large number of zero loadings. 
Most batteries of tests will not allow this demand to be 
satisfied, but with some it can just be attained. Only 
these last, it is Thurstone’s conviction, are suitable for 
defining primary factors, and it is his faith that the factors 
thus mathematically defined will be found to be acceptable 
as psychologically separable unitary traits. 
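
As a small illustration of the battery-size condition, the least whole number of tests n for a given number of factors r can be tabulated. The sketch assumes the condition reads n ≥ ((2r + 1) + √(8r + 1))/2; the function name is invented here.

```python
import math

def least_battery_size(r):
    """Smallest whole number n with n >= ((2r + 1) + sqrt(8r + 1)) / 2."""
    return math.ceil(((2 * r + 1) + math.sqrt(8 * r + 1)) / 2)

for r in (1, 2, 3, 4, 5):
    print(r, least_battery_size(r))
```

For three factors the rule gives six tests, which is the size of the artificial battery used in this chapter.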

2. An example of six tests of rank 3.—To make our 
remarks more definite and concrete, let us suppose that 
we have a battery of six tests whose matrix of correlations 
can be reduced to rank 3. In practice, of course, six tests 
are far too few, and more than three factors quite likely. 
The matrix of loadings given by the “ centroid " system 
contains at first negative quantities. Thus from the 
correlations : 


      |   1      2      3      4      5      6
   1  |   .    .525   .000   .000   .448   .000
   2  | .525    .     .098   .306   .349   .000
   3  | .000   .098    .     .133   .314   .504
   4  | .000   .306   .133    .     .000   .000
   5  | .448   .349   .314   .000    .     .307
   6  | .000   .000   .504   .000   .307    .


with the communalities—
        .674   .634   .558   .415   .490   .493


we get by the “centroid” process the matrix of loadings :

            I       II      III
   1      .542    .612    .074
   2      .629    .342   −.348
   3      .529   −.492    .191
   4      .281   −.182   −.550
   5      .628    .143    .274
   6      .429   −.424    .359


It is the factor axes indicated by these loadings that
Thurstone wishes to rotate until there are no negative
loadings and enough zero loadings to make the position
uniquely defined. For this last purpose he finds, empirically,
that it is necessary to require—

(a) At least one zero loading in each row ;

(b) At least as many zero loadings in each column as
there are columns (here three) ; and

(c) At least as many XO or OX entries in each pair
of columns as there are columns. By an XO entry is
meant a loading in the one column opposite a zero in the
other.

(1) “At least one zero loading in each row.” This means
that no test may contain all the common factors. In
making up the battery, then, the experimenter, with some
idea in his mind as to what the factors are, will endeavour
to ensure that they are not all present in any one test.
This would, for example, exclude from a Thurstone battery
(except as an extra) any very mixed group test, or a mixed
test like the Binet-Simon which is itself a whole battery
of varied items.

(2) “At least as many zeros in each column as there are
columns,” that is, as there are common factors. This
means that in a Thurstone battery no factor may be general,
but must be missing in several tests.

(3) The requirement as to the number of XO or OX entries
is intended to ensure that the tests are qualitatively
distinct from one another.

Now, these requirements cannot generally be met by a
matrix of loadings. It will in general be impossible to
rotate the axes (keeping them orthogonal) until every
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axis is at right angles to r test vectors. The above artificial 
example has, however, been constructed so that this can 
be done. 

The correlations were in fact made from the loadings : 


      |    A       B       C
   1  |    .       .     .821
   2  |    .     .475    .639
   3  |  .718    .206     .
   4  |    .     .644     .
   5  |  .438     .      .546
   6  |  .702     .       .


and the centroid loadings must therefore be capable of 
being rotated rigidly into this form, retaining ortho- 
gonality. 
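
The construction of the artificial example can be retraced by machine. The sketch below assumes the blank entries of the loadings table are exact zeros and reads Test 5's loading on A as .438 (the value that reproduces the stated communality .490); the function names are illustrative.

```python
# Loadings on A, B, C used to manufacture the correlations (blanks = 0).
LOADINGS = [
    [0.0, 0.0, 0.821],
    [0.0, 0.475, 0.639],
    [0.718, 0.206, 0.0],
    [0.0, 0.644, 0.0],
    [0.438, 0.0, 0.546],
    [0.702, 0.0, 0.0],
]

def inner(i, j):
    """Correlation of tests i and j (0-based); communality when i == j."""
    return sum(a * b for a, b in zip(LOADINGS[i], LOADINGS[j]))

def satisfies_simple_structure(loadings):
    """Check the three empirical requirements (a), (b), (c) of the text."""
    r = len(loadings[0])
    cols = list(zip(*loadings))
    rows_ok = all(any(x == 0 for x in row) for row in loadings)
    cols_ok = all(sum(1 for x in col if x == 0) >= r for col in cols)
    pairs_ok = all(
        sum(1 for a, b in zip(cols[i], cols[j]) if (a == 0) != (b == 0)) >= r
        for i in range(r) for j in range(i + 1, r)
    )
    return rows_ok and cols_ok and pairs_ok

communalities = [round(inner(i, i), 3) for i in range(6)]
r12 = round(inner(0, 1), 3)
print(communalities, r12, satisfies_simple_structure(LOADINGS))
```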

3. Two-by-two rotation to simple structure.—The problem
for the experimenter, however, is to discover this “simple
structure,” if it exists; he is not, like us, in the position of
knowing that it does exist, and what it is. Thurstone’s
original method was to use two-by-two rotations, in each
rotation endeavouring to obtain some zero loadings. Let us
illustrate by our artificial example, taking first the centroid
factors I and II. Using their centroid loadings as co-ordinates,
we obtain Figure 23. At once we notice that the test points
3, 4, and 6 are almost collinear on a radius from the origin,
and that if we rotate the axes clockwise through about 42° the
new position of I, labelled I₁ in the diagram, will almost pass
through these test points, while the new axis II₁ will almost
pass through test point 1. On these new axes, therefore,
Tests 3, 4, and 6 will have hardly any projections on axis II₁ ;
that is, will have hardly any loadings in a factor along II₁.
From tables we find sin 42° = .669, and cos 42° = .743. We
have then :

Figure 23.

            Old loadings     |     New loadings
             I       II      |      I₁      II₁
   1       .542    .612      |    −.007    .817
   2       .629    .342      |     .239    .675
   3       .529   −.492      |     .722   −.012
   4       .281   −.182      |     .331    .053
   5       .628    .143      |     .371    .526
   6       .429   −.424      |     .602   −.028

   multipliers   .743   −.669   for I₁ loadings,
                 .669    .743   for II₁ loadings.
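
The arithmetic of this first rotation can be sketched as follows, using the exact sine and cosine of 42° rather than the three-figure values from tables. The function name is illustrative only.

```python
import math

# Centroid loadings on factors I and II for the six tests.
CENTROID_I_II = [
    (0.542, 0.612), (0.629, 0.342), (0.529, -0.492),
    (0.281, -0.182), (0.628, 0.143), (0.429, -0.424),
]

def rotate_axes_clockwise(points, degrees):
    """Co-ordinates of the points after turning both axes clockwise."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    # New I = c*I - s*II ; new II = s*I + c*II (the multipliers of the text).
    return [(c * x - s * y, s * x + c * y) for x, y in points]

new_loadings = rotate_axes_clockwise(CENTROID_I_II, 42)
for test, (first, second) in enumerate(new_loadings, start=1):
    print(test, round(first, 3), round(second, 3))
```

Tests 3, 4, and 6 come out with loadings near zero on the second new axis.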

We have now obtained our desired three zero (or near zero)
loadings in factor II₁. Accepting the approximations to
zero as good enough for the present, we next make Figure 24
from the loadings of I₁ and III in the same way as we made
the former figure. In this, Test 1 falls quite near the origin.
Tests 5 and 6 are approximately on one radius, and Tests 2
and 4 on another, and these radii are at right angles to one
another. If we rotate the axes I₁ and III rigidly through a
clockwise turn of about 49° they will pass almost through
these radial groups and nearly zero projections will result.*
Using sin 49° = .755 and cos 49° = .656 we perform a similar
calculation to the preceding, using the loadings I₁ and III as
starting-point and obtaining loadings on I₂ and III₁ (the
subscript indicating the number of rotations that axis has
undergone). We have finally, putting our results together,
the table of loadings overleaf FA.†

Figure 24.


* The rotation might with advantage have been carried a little
further.

† The matrix symbols, using Thurstone's notation, are given for
the convenience of mathematical readers. Others should ignore
them. When the tests are many and the centroids few, a saving can
be effected by picking tests equal in number to the factors and
performing the rotations on their loadings F₁. The resulting
loadings, premultiplied by the inverse of F₁, can then be used as a
rotating matrix on the whole table F of centroid loadings. The
tests chosen to form F₁ should represent different clusters.


      |    I₂      II₁     III₁
   1  |  −.060    .817    .043
   2  |   .420    .675   −.048
   3  |   .329   −.012    .670
   4  |   .632    .053   −.111
   5  |   .037    .526    .460
   6  |   .124   −.028    .690


Clearly, this is an approximation to the loadings of the
factors A, B, and C which we who are in the secret (as a
real experimenter is not) know to have been used in making
the correlations : III₁ here is A, I₂ here is B, and II₁ is C.
The small loadings are not quite zero, and the other loadings
not quite the same, but a further set of rotations would
refine the results and bring them nearer to the A B C values.

4. New rotational method.—When this two-by-two rotational
method is used on a large battery of tests, with
perhaps six or seven factors instead of three, it is not
only laborious but somewhat difficult to follow. Thurstone
has, however, devised a method of rotation which
takes the factors three at a time, and to this we now turn,
still using our small artificial example as illustration. In
this example, since there are only three factors, this new
method leads to a complete solution at once. With more
factors the matter would be more complicated.

If the reader will think of the three centroid factors as
represented by imaginary lines in the room in which he is
sitting (Figure 25), he will be aided in following the
explanation of this new method. Imagine the first
centroid axis to be vertically in the middle of the room,
and the other two centroid axes on the carpet, at right
angles to the first and to each other. The test points are
in various positions in the room space, if we take their three
centroid loadings as co-ordinates and treat the distance from
floor to ceiling as unity. Imagine each test point joined

by a line to the origin (in the middle of the carpet, where 
the axes cross). The lengths of these lines are the square 
roots of the communalities, and the loadings on the first 
centroid factor are their projections on to the vertical axis, 
the height, that is, of each test point above the floor. 


Figure 25 (not to scale). 


Thurstone now imagines each of these lines or communality
vectors produced until it hits the ceiling, making
a pattern of dots on the ceiling. These extended vectors
now all have unit projection on the first centroid axis,
for we agreed to call the distance from floor to ceiling
unity. Their y and z co-ordinates on the ceiling will be
correspondingly larger than their loadings on the second
and third centroid factors, and can be obtained by dividing
each row of the centroid loadings by the first loading. In
our case this gives us the following table, obtained in the
manner just mentioned from the table on page 153.


Extended centroid projections

      |     I        II       III
   1  |  1.000    1.129     .137
   2  |  1.000     .544    −.553
   3  |  1.000    −.930     .361
   4  |  1.000    −.648   −1.957
   5  |  1.000     .228     .436
   6  |  1.000    −.988     .837
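
The extension of the vectors to the ceiling is a single division per row, as the following sketch shows. The variable names are illustrative.

```python
# Centroid loadings (factors I, II, III) of the six tests.
F = [
    [0.542, 0.612, 0.074],
    [0.629, 0.342, -0.348],
    [0.529, -0.492, 0.191],
    [0.281, -0.182, -0.550],
    [0.628, 0.143, 0.274],
    [0.429, -0.424, 0.359],
]

# Divide every row by its first loading, so that each extended vector
# has unit projection on the first centroid axis.
extended = [[x / row[0] for x in row] for row in F]

for test, row in enumerate(extended, start=1):
    print(test, [round(x, 3) for x in row])
```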
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The second and third columns are now the co-ordinates 
of those dots on the ceiling of which we spoke. A diagram 
of the ceiling, seen from above, is given in Figure 26 and 
the important point about it is that the dots form a
triangle.

Figure 26.

If the reader will now picture this triangle as drawn on
the ceiling of his room, and remember that the origin, where
the centroid axes crossed, is in the middle of the carpet,
he can next imagine an inverted three-cornered pyramid,
with the triangle on the ceiling as its base, the origin in
the middle of the carpet as its apex and the communality
vectors 1, 4, and 6 as its edges. The vector 5 lies on one
of the faces of this pyramid ; vector 2 lies on another ;
vector 3 lies on the remaining face, all springing from the
origin and going up


zeros we desire. The three vectors 
one face, and will have zer 
at right angles to that face, 

ave Zero projections on the li 


th an i A i k 
original 4, B, and o €y can be identified with the 


tions of the three sides 
Where there are many te 
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collinear, one plan is to draw a line through them by eye, 
and measure the distances a and b it cuts off on the axes, 
then using the equation—

$$\frac{y}{a} + \frac{z}{b} = 1.$$


Or we can write down the equations of the lines joining
points at the corners, either actual test points, or the places
where our lines intersect, using the equation—

$$(lv - mu) + (m - v)\,y + (u - l)\,z = 0$$

when l, m are the co-ordinates of one corner, and u, v of
another. We obtain in our case—

    −2.121 + 2.094y − 1.777z = 0    for line 1, 2, 4,
    −1.080 +  .700y + 2.117z = 0     ,,   ,,  1, 5, 6,
     2.476 + 2.794y +  .340z = 0     ,,   ,,  4, 3, 6,

where y means the extended II, and z the extended III.
Before we go further we have to divide each equation
through by the root of the sum of the squares of its
coefficients, so that the new coefficients sum to unity when
squared—this is called normalizing and is necessary in
order to keep the communalities right and for other reasons.
The equations then are :

    −.611 + .603y − .512z = 0     (1)
    −.436 + .283y + .854z = 0     (2)
     .660 + .745y + .091z = 0     (3)
and it is clear, from the way in which they have been 
reached, that these equations will be satisfied by the ex- 
tended co-ordinates of certain of the rows in the table on 
page 153. Consider the first equation and write its
coefficients above the columns of that table, placing −.611
over the first column, as shown at the foot of page 159.

          −.611    .603   −.512   |  Weighted
            x       y       z     |    sum
   1      .542    .612    .074    |    .000
   2      .629    .342   −.348    |    .000
   3      .529   −.492    .191    |   −.718
   4      .281   −.182   −.550    |    .000
   5      .628    .143    .274    |   −.438
   6      .429   −.424    .359    |   −.701
If we multiply each column by the multiplier above it
and add the rows we get the quantities shown on the right
for comparison with page 154. The zeros are in the right
places for factor A. The other loadings are, however,
negative, but that can be easily put right by changing all
the signs of the multipliers, which we are at liberty to do.
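
The steps for one side of the triangle (join two corners, normalize the equation, and use its coefficients as multipliers) can be put together in a brief sketch. The corners are computed here from the centroid loadings of Tests 1 and 4, and all function names are illustrative.

```python
import math

F = [
    [0.542, 0.612, 0.074],
    [0.629, 0.342, -0.348],
    [0.529, -0.492, 0.191],
    [0.281, -0.182, -0.550],
    [0.628, 0.143, 0.274],
    [0.429, -0.424, 0.359],
]

def extended(row):
    """Ceiling co-ordinates (y, z) of a test: loadings II, III divided by I."""
    return (row[1] / row[0], row[2] / row[0])

def line_through(p, q):
    """Coefficients (k, a, b) of k + a*y + b*z = 0 through corners p and q."""
    (l, m), (u, v) = p, q
    return (l * v - m * u, m - v, u - l)

def normalize(coeffs):
    """Scale the coefficients so that their squares sum to unity."""
    size = math.sqrt(sum(c * c for c in coeffs))
    return tuple(c / size for c in coeffs)

multipliers = normalize(line_through(extended(F[0]), extended(F[3])))
weighted_sums = [sum(k * x for k, x in zip(multipliers, row)) for row in F]
print([round(k, 3) for k in multipliers])
print([round(w, 3) for w in weighted_sums])
```

The weighted sums vanish (to rounding) for Tests 1, 2, and 4 and are negative elsewhere, which is why the signs of the multipliers are afterwards reversed.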
Similarly, using eqns. (2) and (3) we get the loadings of
factors B and C exactly, except for an occasional difference
due to rounding off at the third decimal place. We have,
indeed, found the matrix product FA,

           F                          A                       A      B      C
   .542    .612    .074       .611    .436    .660           .      .     .821
   .629    .342   −.348      −.603   −.283    .745           .     .475   .639
   .529   −.492    .191   ×   .512   −.854    .091     =    .718   .206    .
   .281   −.182   −.550                                      .     .644    .
   .628    .143    .274                                     .438    .     .546
   .429   −.424    .359                                     .702    .      .

except, as has been already said, for occasional discrepancies
in the third decimal place. The procedure we have described
has enabled us to discover this last matrix, with which, in
fact, we began. And by analogy (is the deduction sound ?) an
experimenter with experimental … they may have been …
The matrix A beginning with .611 is the rotating
matrix which turns the axes I, II, III into the new positions
A, B, C. Its columns are the direction cosines of A, B, and C
with reference to I, II, III. Are A, B, and C at right angles
to one another ? We find—

    .611   −.603    .512       .611    .436    .660       1   .   .
    .436   −.283   −.854   ×  −.603   −.283    .745   =   .   1   .
    .660    .745    .091       .512   −.854    .091       .   .   1
T me n ns 
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(again allowing for third decimal place discrepancies). 
That is to say, the angles between A, B, and C have zero
cosines, they are right angles.
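
Both statements (that FA reproduces the loadings on A, B, C, and that A is orthogonal) can be checked by direct multiplication. The sketch below uses the three-decimal matrices of the text, reads Test 5's loading on A as .438, and therefore expects agreement only to about the third decimal place.

```python
F = [
    [0.542, 0.612, 0.074],
    [0.629, 0.342, -0.348],
    [0.529, -0.492, 0.191],
    [0.281, -0.182, -0.550],
    [0.628, 0.143, 0.274],
    [0.429, -0.424, 0.359],
]
A = [
    [0.611, 0.436, 0.660],
    [-0.603, -0.283, 0.745],
    [0.512, -0.854, 0.091],
]
ORIGINAL = [  # loadings on A, B, C from which the correlations were made
    [0.0, 0.0, 0.821], [0.0, 0.475, 0.639], [0.718, 0.206, 0.0],
    [0.0, 0.644, 0.0], [0.438, 0.0, 0.546], [0.702, 0.0, 0.0],
]

def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

rotated = matmul(F, A)
largest_error = max(abs(x - y) for r1, r2 in zip(rotated, ORIGINAL)
                    for x, y in zip(r1, r2))
AtA = matmul(transpose(A), A)
largest_departure = max(abs(AtA[i][j] - (1.0 if i == j else 0.0))
                        for i in range(3) for j in range(3))
print(round(largest_error, 4), round(largest_departure, 4))
```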

The axes A, B, and C were drawn at right angles to the
three planes which form the pyramid mentioned above, and
therefore these three planes are also at right angles to one
another. (Our rough sketch in Figure 25 made the pyramid
too acute.) It follows that A, B, and C are actually
the edges of the pyramid. In our example (though this
need not be the case) they happen to pass each through a
test point in the room, A through Test 6, B through Test 4,
and C through Test 1. These tests are not identical with
the factors, for each test contains a specific element, not in
the common-factor space, but at right angles to it. What
we have called a test point is the end of the unit test vector
projected on to the common-factor space. The complete
test vectors are out in a space of more dimensions, of
which the three-dimensional common-factor space is a
subspace.

6. Landahl preliminary rotations.—When there are more 
than three centroid factors, the calculations are not so 
simple. If the common-factor space is, for example, 
four-dimensional, then the table of extended vectors, in 
addition to its first column of unities, will have three other 
columns. The two-dimensional ceiling of our room, in our 
former analogy, has here become three-dimensional, a 
hyper-plane at right angles to the first centroid axis. On 
paper its dimensions can only be graphed two at a time, 
and no complete triangle will be visible among the dots. 
But sets of dots will be seen to be collinear, lines can be 
drawn through them, and a procedure similar to that out- 
lined above followed. This will become clearer when we 
work a four-dimensional example. First, however, it is 
desirable to explain, on our simple three-dimensional 
example, a device which facilitates the work on higher 
dimensional problems, called the Landahl rotation. It is 
unnecessary in the three-dimensional case, and we are 
using it only to explain it for use with more than three 


dimensions. 
A Landahl rotation turns the centroid axes solidly 


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to a position where each of them is equally inclined to the 
original first centroid axis. In our imagined room the first 
centroid axis ran vertically from the middle of the floor 
to the middle of the ceiling, while the other two were 
drawn on the floor itself. Imagine all three (retaining 
their orthogonality) to be moved, on the origin as pivot, 
until they are equally inclined to the vertical so that they 
enclose the inverted pyramid of Figure 25. That is a 
Landahl rotation. The lines through the test points have 
not moved. They remain where they were, and still hit 
the ceiling in the same pattern of dots. The projections of 
the extended vectors on to the original first centroid 
axis all still remain unity. But for the next step in this 
method we need their projections on to the Landahl axes, 
We obtain these by post-multiplying the matrix of centroid
extended loadings by a Landahl matrix, an orthogonal
matrix with each element in its first row equal to $1/\sqrt{c}$,
where c is the order of the matrix ; that is, its number of
rows or columns (Landahl, 1938). We need a Landahl
matrix of order 3, for example :

     .577    .577    .577
     .816   −.408   −.408
     .000    .707   −.707


The element -577 is the cosine of the angle which each axis 


makes, after rotation, with the original position of the first 
centroid axis. 


When the table of extended vector projections on page
157 is post-multiplied by the above matrix, the table on
page 163 results, giving the projections of the extended
vectors on to the Landahl axes L, M, N.


Projections on Landahl axes*

             L        M        N
   1      1.498     .213     .020
   2      1.021    −.036     .746
   3      −.182    1.212     .701
   4       .048    −.542    2.225
   5       .763     .792     .176
   6      −.229    1.572     .388


From this table three diagrams LM, LN, and MN can
be made, and the reader is … of them shows … In a
multi-dimensional problem several are

needed, and as a rule only one line is used on each diagram 
employed. Here, from the LN diagram we find the 
equations of the three sides of the triangle to be : 


    −2.205l − 1.450n + 3.332 = 0
      .368l + 1.727n −  .586 = 0
     1.837l −  .277n +  .528 = 0


We want to make these homogeneous in l, m, and n, and so
we add, after each of the numerical terms, the factor
.577 (l + m + n), which equals unity. The equations
then are :


     −.282l + 1.923m +  .473n = 0
      .030l −  .338m + 1.389n = 0
     2.142l +  .305m +  .028n = 0
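
The step of making the equations homogeneous can be sketched as below. The function name and argument order are invented for illustration; the square root of the order is used in place of the rounded value .577.

```python
import math

def homogenize(coeff_l, coeff_n, constant, order=3):
    """Turn coeff_l*l + coeff_n*n + constant = 0 into coefficients of l, m, n.

    The constant is multiplied by (l + m + n) / sqrt(order), a quantity
    equal to unity for every extended vector.
    """
    k = constant / math.sqrt(order)
    return (coeff_l + k, k, coeff_n + k)

# The three sides of the triangle as read from the LN diagram.
sides_LN = [(-2.205, -1.450, 3.332), (0.368, 1.727, -0.586), (1.837, -0.277, 0.528)]
homogeneous = [homogenize(*side) for side in sides_LN]
for coeffs in homogeneous:
    print([round(c, 3) for c in coeffs])
```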


* After a Landahl adjustment the axes are not infrequently
already near simple structure, as here. It is sometimes worth while
to rotate them slowly round the original first centroid, like spinning
an umbrella, to improve the approximation to zero entries. This
can be done by an orthogonal matrix whose columns sum to unity,
as e.g.

      .9900   −.0946    .1046
      .1046    .9900   −.0946
     −.0946    .1046    .9900

or its transpose : and the rotation will be the slower, the nearer the
diagonal elements are to unity.
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After normalizing, these become : 


     −.141l + .961m + .236n = 0
      .021l − .236m + .971n = 0
      .990l + .141m + .013n = 0

Writing the coefficients as columns in a matrix, and
premultiplying by Landahl’s matrix (since at an earlier
stage we post-multiplied by it) we obtain :


      .609    .436    .660
     −.603   −.283    .745
      .513   −.853    .090


the same matrix A as we arrived at (page 160) without the 
use of Landahl’s rotation. The advantage of using a 
Landahl rotation appears only in problems with more 
than three common factors. The reader can readily make 
a Landahl matrix of any required order, say 5. Fill the
first row with the root reciprocal of 5, .447. Complete
the first column by putting in the second place .894
(because .447² + .894² = 1), and below that zeros. The
second row must then be completed with equal elements, 
all negative, such that the row sums to zero. Then the 
second column is completed in a similar way, and the third 
row, and so on. The reader should finish it. There are 


alternative forms possible, one of which is used below. 
An unfinished Landahl matrix

     .447    .447    .447    .447    .447
     .894   −.224   −.224   −.224   −.224
     .000    .866   −.289   −.289   −.289
     .000    .000
     .000    .000
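
For readers who would rather not finish the matrix by hand, the filling-in rule can be sketched as a short routine. The function name is invented here; the routine yields the form described above, not the alternative forms.

```python
import math

def landahl_matrix(c):
    """Orthogonal c x c matrix whose first row is all 1/sqrt(c).

    Each later row has zeros, then one positive leading element, then equal
    negative elements chosen so that the row sums to zero and has unit length.
    """
    rows = [[1.0 / math.sqrt(c)] * c]
    for k in range(1, c):
        remaining = c - k  # number of equal negative elements in this row
        lead = math.sqrt(remaining / (remaining + 1.0))
        tail = -1.0 / math.sqrt(remaining * (remaining + 1.0))
        rows.append([0.0] * (k - 1) + [lead] + [tail] * remaining)
    return rows

def max_deviation_from_identity(m):
    """Largest absolute entry of M M' - I (rows taken in pairs)."""
    n = len(m)
    return max(abs(sum(a * b for a, b in zip(m[i], m[j]))
                   - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

M5 = landahl_matrix(5)
for row in M5:
    print([round(x, 3) for x in row])
```

With `c = 3` the same routine returns the order 3 matrix used earlier in the chapter.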
7. A four-dimensional example.—The following example
of 


structure can be arri 


ved at. 
four centroid factors 


with the 1 
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Centroid loadings F 
I II III IV 


     .645    .307   −.357   −.109
After these have been “ extended ” (i.e. divided in each 


row by the first loading) they were post-multiplied by a 
Landahl matrix, one of the alternative forms, viz. : 


= 


oe à Xx 
co 
| 
e 
| 
St 


and the resulting projections on the Landahl axes were 
thus found to be : 


             L        M        N        P
   1  |   1.007     .704     .122     .166
   2  |     …       .068     .848    −.030
   3  |    .679     .678     .625     .018
   4  |    .218    1.492     .158     .132
   5  |   −.311     .199     .453    1.660
   6  |    .455    −.247    1.270     .522
   7  |   −.107     .598     .808     .701
   8  |    .308    −.285    1.094     .833
   9  |   1.015     .387    −.285     .883
  10  |    .376    1.099     .070     .454

Six diagrams can be made, and it is advisable to draw 
them all, though not all are necessary. The LN diagram 
is shown in Figure 27. We scan it for collinear points 
(not necessarily radial) which have all or nearly all the other 
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points on one side of their line, and note the line 5, 4, 10, 9. 
Its equation is readily found to be approximately :

     .738l + 1.327n − .371 = 0.

We make this homogeneous by substituting for unity, after
the numerical term .371, the quantity .5 (l + m + n + p),
for .5 is the cosine of the angle each of the Landahl axes
makes with the original first centroid axis. This gives us
the equation (not yet normalized) :

     .553l − .185m + 1.141n − .185p = 0.

Three more equations are needed, and one of them can
indeed be obtained from the same diagram, on which
… very nearly collinear. The reader is … three lines
making large angles ( … with L, M, and P.

Figure 27.

It will be remembered that in our earlier example the
sign of one equation had to be changed at the end of the
calculation because large negative values were appearing
in the final matrix of loadings. This can be obviated
by attending to the following rule. If the other test-points
are on the same side of the line as the origin the numerical
term must be positive in the equation ; if they are on the
side remote from the origin the numerical term must be
negative. In the adjacent diagram, the origin and the
other points are on opposite sides of the line through
5, 4, 10, 9 and therefore the numerical term must be
negative, as it is (−.371). Had it been positive all the
signs of the equation would have required to be changed.
m eae method of reaching simple structure.— 

as pointed out that When simple structure 


u 
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can be attained (whether orthogonal or oblique) then as 
many r-rowed principal minors of the reduced correlation
matrix must vanish as there are common factors; and 
that it follows that the same number of vanishing deter- 
minants must be discoverable in the table of centroid 
loadings. Thus, for example, in the table of centroid 
loadings on page 153 the three determinants composed 
respectively of rows 1, 2, and 4; of rows 1, 5, and 6; and 
of rows 3, 4, and 6 all vanish, and these rows are where the 
zeros come in the three columns of the simple structure. 
This gives an alternative method of reaching simple
structure. Test every possible r-rowed determinant in
the centroid table of r factors. If r of them are discovered
to vanish, then simple structure may be and probably is
possible. Each of these vanishing determinants will
provide a column of the rotating matrix A, for which purpose
we delete any one of its rows and calculate all the r−1
rowed minors from what is left. The column has then to
be normalized. This process works equally well for
oblique simple structure (see next Chapter). Its drawback,
when the number of factors is large, is the necessity
of calculating so many determinants to discover those that
vanish.
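
For the three-factor example the whole search is small enough to sketch directly. The tolerance of .005 for "vanishing" is an assumption made here to allow for three-decimal rounding, and the sign of each column is fixed by requiring the resulting loadings to be positive on the whole; the function names are illustrative.

```python
from itertools import combinations
import math

F = [
    [0.542, 0.612, 0.074],
    [0.629, 0.342, -0.348],
    [0.529, -0.492, 0.191],
    [0.281, -0.182, -0.550],
    [0.628, 0.143, 0.274],
    [0.429, -0.424, 0.359],
]

def cross(a, b):
    """The three signed two-rowed minors of the rows a and b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def det3(a, b, c):
    """Determinant of the three rows a, b, c."""
    return sum(x * y for x, y in zip(c, cross(a, b)))

def column_from_rows(a, b):
    """Normalized minors of two rows, signed so the loadings are mostly positive."""
    raw = cross(a, b)
    size = math.sqrt(sum(x * x for x in raw))
    col = [x / size for x in raw]
    if sum(sum(k * x for k, x in zip(col, row)) for row in F) < 0:
        col = [-x for x in col]
    return col

TOLERANCE = 0.005  # assumed allowance for rounding in the centroid table
vanishing = [trio for trio in combinations(range(len(F)), 3)
             if abs(det3(*(F[i] for i in trio))) < TOLERANCE]
columns = [column_from_rows(F[i], F[j]) for i, j, _ in vanishing]
print([tuple(i + 1 for i in trio) for trio in vanishing])
for col in columns:
    print([round(x, 3) for x in col])
```

Each vanishing determinant has one row deleted (here its last) before the minors are taken, as the text directs.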

9. Limits to the extent of factors.*—Orthogonal simple
structure requires that no factor shall extend through many
tests, and it is possible to decide beforehand, from the
correlations, whether factors running through not more
than s tests each are adequate to give the measured
correlations, leaving n − s zeros. They will not as a rule be able
to do so if the average correlation exceeds (s − 1)/(n − 1) :
more exactly, not if the largest latent root of the matrix
is larger than s. If these rules are to be applied when
communalities are used, as is the case when testing whether
orthogonal simple structure is possible, the matrix should
first be “corrected for communality,” i.e. each r must be
divided by the square root of the product of the two
communalities concerned. Approximations to the largest
latent root of a matrix of correlations, when the entries are
all positive, are—

$$\frac{\text{sum of the whole matrix}}{n}$$

or more accurately—

$$\frac{\text{sum of the squares of the column totals}}{\text{sum of the whole matrix}}$$

An exact test for the possibility of orthogonal simple
structure has been given (Ledermann, 1936) and is described
in the Appendix, page 867, but it requires a prohibitive
amount of calculation.

* A brief summary of a chapter with this title in previous editions.
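
The two approximations to the largest latent root are easily tried on a small matrix. The correlation matrix below is hypothetical, and the iterative routine is added here only as a reference value for comparison; neither comes from the text.

```python
def approximations(matrix):
    """Return the two approximations to the largest latent root."""
    n = len(matrix)
    column_totals = [sum(col) for col in zip(*matrix)]
    whole = sum(column_totals)
    first = whole / n
    second = sum(t * t for t in column_totals) / whole
    return first, second

def largest_root_by_iteration(matrix, steps=200):
    """Reference value by repeated multiplication (not part of the text)."""
    vector = [1.0] * len(matrix)
    root = 0.0
    for _ in range(steps):
        product = [sum(a * x for a, x in zip(row, vector)) for row in matrix]
        root = max(abs(p) for p in product)
        vector = [p / root for p in product]
    return root

# Hypothetical matrix of positive correlations with unit diagonal.
R = [
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.2],
    [0.3, 0.2, 1.0],
]
first, second = approximations(R)
reference = largest_root_by_iteration(R)
print(round(first, 3), round(second, 3), round(reference, 3))
```

In this instance the second approximation lies nearer the reference value than the first, and neither exceeds it.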

Even, however, when orthogonal simple structure cannot
be attained with orthogonal factors, it may be possible to
reach it with oblique factors.

10. Leading to oblique factors. —In t
kept our factors orthogonal ;
correlated with one another,
to be different qualities,


as it were, overlapped. Yet in situations where more 


e do not hesitate to use 
scribing aman. For instance, 
we give a man’s height and weight, although these are’ 


hich will not 
if orthogonal 


gonality. 
tone expressly 


ome extent in 
I was at one time 


S, and permit factors to 
necessary to attain 
found to be quite 
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highly correlated. A chapter on these oblique factors* is 
therefore necessary, and out of them arise Thurstone's 
“ second order factors.” 

11. Parallel proportional profiles.—A method which, like
Thurstone’s simple structure, is meant to enable us to 
arrive at factors which are real entities, or to check 
whether our hypotheses about the factor composition of 
tests are correct, has been put forward by R. B. Cattell 
(1944b, 1946), and has interesting possibilities which its 
author will no doubt develop. The essence of his idea 
is that “if a factor is one which corresponds to a true 
functional unity, it will be increased or decreased ‘as a 
whole’,” and therefore if the same tests are given under 
two different sets of circumstance, which favour a certain 
factor more in one case and less in the other, the loadings 
of the tests in that factor should all change in the same
proportion. Experimental trials of this principle may be
expected soon from its author. Among “different
circumstances” he mentions different samples of subjects,
differing, say, in age or sex, and different methods of
scoring, or different associated tests in the battery. But
he prefers another kind of change of circumstance; namely a
change “from measures of static, inter-individual differences
to measures from other sources of differences in the same
variables.” He instances, among his examples,
intercorrelating changes in scores of individuals with time, or
intercorrelating differences of scores in twins. We may
thus have two, or several, centroid analyses, and the mathe- 
matical problem is to find rotations which will leave the 
profile of loadings of a certain factor similar in all the factor 
matrices. It may even be that the profiles of several fac- 
tors could be made similar. These factors would then 
satisfy Cattell’s requirement as corresponding to “true 
functional unities.” The necessary modes of calculation 
to perform these rotations have not yet been more than 
adumbrated, however. 

* It must be clearly understood that this obliquity or correlation
of factors is quite a different matter from the correlation of estimates,
even of orthogonal factors, due to the excess of factors over tests
described on pages 287 to 242.




CHAPTER XII
OBLIQUE FACTORS 


1. Pattern and structure.—So long as the factors are
orthogonal, the loadings in the matrix of loadings are also
the correlations between the factor and the tests, but this
ceases to be the case when the factors are correlated. The
word “loading” continues to be used for the coefficients
such as l, m, and n in equations like—

\[ z = l\alpha + m\beta + n\gamma \]

and the matrix or table of these is called a pattern, while
the matrix of correlations between tests and factors is
called a structure. The entries in a structure are
projections from a point on to certain axes, [the entries in a]
pattern are the oblique co-ordinates of t[he point on]
those axes. The two are only identical [when the axes are]
orthogonal.

Moreover, as soon as the factors become oblique, it
becomes necessary to distinguish between “reference
vectors” and “primary factors.” The reference vectors
are [...] which the centroid axes have been [rotated] so
that the test-projections on to them include a number of
zeros. Each reference vector is at right angles to a
hyperplane containing a number of communality vectors,
[...] of one dimension less [...]. In our first [...]
Chapter XI the hyperplanes were ordinary [...] pyramid
there referred to (see page [...]), and each reference vector
was at right angles to [one] of those faces.

The primary factor corresponding to a given reference
vector is the li[ne] [...] [hyper]planes excluding that [...]
[pyra]mid where those two faces met, excluding that face to
which the reference vector was orthogonal.

Now, when the reference vectors turn out to be at right 
angles to each other, as they did in that example, each 
reference vector is identical with its own primary factor. 
But not when the reference vectors turn out to be oblique. 
In Chapter XI we did not distinguish them, and called their
common line the “factor.” But in this chapter the
distinction must be kept clearly in mind. It is the primary
factors Thurstone wants. The reference vectors are only
a means to an end.

Thurstone’s second method of rotation described in
Chapter XI, the method in which the communality
vectors are “extended,” and lines drawn on the diagrams
which are not necessarily radial lines, will not keep the
axes orthogonal, but seeks for the axes on which a number
of projections are zero, regardless of whether the resulting
directions are orthogonal or oblique. In general they will
be oblique, and the examples worked in Chapter XI only
gave orthogonal simple structure because they had been
devised so as to do so. The test of orthogonality is that
the matrix of rotation, premultiplied by its transpose,
gives the unit matrix (see page 160). Or in other words,
that the inner products of the columns of the rotating
matrix are all zero. They are the cosines of the angles
between the reference vectors, and the cosine of 90° is
zero.
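The test of orthogonality described above can be sketched as follows. This is only an illustration under stated assumptions: the two small matrices, the tolerance, and the function names are invented for the purpose and do not come from the text.

```python
import math


def transpose_times_self(matrix):
    """Return M'M, whose entries are the inner products of the columns of M."""
    rows, cols = len(matrix), len(matrix[0])
    return [[sum(matrix[k][i] * matrix[k][j] for k in range(rows))
             for j in range(cols)]
            for i in range(cols)]


def is_orthogonal(matrix, tolerance=0.005):
    """True when M'M agrees with the unit matrix within the tolerance."""
    product = transpose_times_self(matrix)
    size = len(product)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tolerance
               for i in range(size) for j in range(size))


# Hypothetical rotation of two axes through 30 degrees: columns at right angles.
angle = math.radians(30)
rotation = [[math.cos(angle), -math.sin(angle)],
            [math.sin(angle), math.cos(angle)]]

# Hypothetical oblique case: two unit columns only 60 degrees apart.
oblique = [[1.0, 0.5],
           [0.0, math.sqrt(0.75)]]

print(is_orthogonal(rotation), is_orthogonal(oblique))
```

In the oblique case the off-diagonal entry of M'M is 0.5, the cosine of 60°, which is the sense in which the inner products of the columns are the cosines of the angles between the vectors.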
2. Three oblique factors.—To illustrate Thurstone’s
method when the resulting factors are oblique we shall 
next work an example devised to give three oblique 
common factors. Consider this matrix of correlations : 


        1       2       3       4       5       6       7
1       .     .728    .167    .372    .153    .105    .126
2     .728      .     .696    .583    .651    [...]   [...]
3     .167    .696      .     .857    [...]   [...]   .740
4     .372    .583    .857      .     [...]   [...]   [...]
5     .153    .651    [...]   [...]     .     [...]   [...]
6     .105    [...]   [...]   [...]   [...]     .     [...]
7     .126    [...]   .740    [...]   [...]   [...]     .

which, with guessed communalities, gives these centroid
loadings:

                F
        I       II      III
1     .449    -.682    .165
2     .825    -.478   -.129
3     .906     .336    .020
4     .846     .133    .457
5     .808     .208   -.412
6     .697     .336    .335
7     .767     .173   -.468


When these projections on the centroid axes are
“extended,” that is, when each row is divided by the first
loading in that row, we obtain this table:

        I_e     II_e    III_e
1     1.000   -1.519    .367
2     1.000    -.579   -.156
3     1.000     .371    .022
4     1.000     .157    .540
5     1.000     .257   -.510
6     1.000     .482    .481
7     1.000     .226   -.610


The columns II_e and III_e in this table represent the
co-ordinates of the “dots on the ceiling” in our analogy of
Chapter XI, p. 157. When we make a diagram of them we
obtain Figure 28. We see that a triangular formation is
present, and we draw the dotted lines shown.

[Figure 28.]
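The “extension” of the centroid loadings can be sketched in a few lines. The loadings used are those of the worked example as read here from the printed table of centroid loadings; the function and variable names are illustrative only.

```python
# Centroid loadings of Tests 1 to 7 on the centroid axes I, II, III
# (values as read from the worked example; treat unclear digits as assumptions).
centroid_loadings = [
    [0.449, -0.682, 0.165],
    [0.825, -0.478, -0.129],
    [0.906, 0.336, 0.020],
    [0.846, 0.133, 0.457],
    [0.808, 0.208, -0.412],
    [0.697, 0.336, 0.335],
    [0.767, 0.173, -0.468],
]


def extend(loadings):
    """Divide every row by its first loading, so that the first column becomes 1."""
    return [[value / row[0] for value in row] for row in loadings]


extended = extend(centroid_loadings)
for test_number, row in enumerate(extended, start=1):
    print(test_number, ["%.3f" % value for value in row])
```

The second and third entries of each printed row are the co-ordinates of that test's dot in the diagram.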

It is not essential, it may be remarked in passing, that
there be no points elsewhere than on the lines, provided
they are additional to those required to fix the simple
structure. Had it not been for the desirability of keeping
the example small we would have increased the number of
tests, and not only arranged for further points to fall on
these lines, but also included some whose dots fell inside
the triangle, representing tests which involve all three
factors.
We find the equations of these lines to be approximately

\[ .477 + .502y + .953z = 0 \quad (\text{line } 1, 2, 7) \]
\[ 1.113 + .183y - 2.119z = 0 \quad (\text{line } 1, 4, 6) \]
\[ .403 - 1.091y + .256z = 0 \quad (\text{line } 7, 5, 3, 6) \]

The coefficients of each equation have to be “normalized,”
that is, reduced proportionately so that the sum of
their squares is unity (for they are to be direction cosines).
These normalized coefficients are then written as columns
in a matrix as follows:

\[ \Lambda = \begin{pmatrix} .405 & .464 & .338 \\ .426 & .076 & -.916 \\ .809 & -.883 & .215 \end{pmatrix} \]
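The normalizing step can be checked with a short sketch, here applied to the coefficients of the second line (1.113, .183, −2.119). The function name is an assumption made for the illustration.

```python
import math


def normalize(coefficients):
    """Scale the coefficients so that the sum of their squares is unity."""
    length = math.sqrt(sum(c * c for c in coefficients))
    return [c / length for c in coefficients]


# Coefficients of the line through the dots of Tests 1, 4 and 6.
direction_cosines = normalize([1.113, 0.183, -2.119])
print([round(c, 3) for c in direction_cosines])
```

The three rounded values agree with the second column of the rotating matrix.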


The table of centroid loadings on page 172 must now be 
post-multiplied by this rotating matrix to obtain the 
projections of the tests on the three reference vectors which 
are at right angles to the planes defined by the dotted lines 
in our diagram. We obtain this table : 


V = FΛ
(Simple) Structure on the Reference Vectors

        L'      B'      D'
1     .025    .011    .812
2     .026    .460    .689
3     .526    .428    .003
4     .769   -.001    .262
5     .083    .755   -.006
6     .696    .053    .000
7     .006    .782    .000




We have labelled the columns L', B', and D' for a reason
which will become apparent later, when we explain how
the correlations were, in fact, made. This table is a simple
structure, formed by the projections on the reference
vectors. It has a zero (or near-zero) in each row, and
three or more in each column, in the positions to be
anticipated from Figure 28; for example, tests 3, 5, 6,
and 7, which are collinear in the figure, have zeros in
column D'.
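The post-multiplication of the centroid loadings by the rotating matrix can be reproduced with the following sketch. The numbers are those of the worked example as read here; the helper `matmul` and the variable names are assumptions made for the illustration.

```python
def matmul(a, b):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Centroid loadings F of Tests 1 to 7 on the centroid axes I, II, III.
F = [
    [0.449, -0.682, 0.165],
    [0.825, -0.478, -0.129],
    [0.906, 0.336, 0.020],
    [0.846, 0.133, 0.457],
    [0.808, 0.208, -0.412],
    [0.697, 0.336, 0.335],
    [0.767, 0.173, -0.468],
]

# Rotating matrix: normalized line coefficients written as columns.
LAMBDA = [
    [0.405, 0.464, 0.338],
    [0.426, 0.076, -0.916],
    [0.809, -0.883, 0.215],
]

# Projections of the tests on the three reference vectors.
V = matmul(F, LAMBDA)
for test_number, row in enumerate(V, start=1):
    print(test_number, ["%.3f" % value for value in row])
```

Tests 3, 5, 6 and 7 come out with entries close to zero in the third column, as the collinearity of their dots would lead one to expect.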

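Now let us test the angles between the reference vectors.
To do this we premultiply the rotating matrix by its
transpose:

\[ \Lambda'\Lambda = \begin{pmatrix} .405 & .426 & .809 \\ .464 & .076 & -.883 \\ .338 & -.916 & .215 \end{pmatrix} \begin{pmatrix} .405 & .464 & .338 \\ .426 & .076 & -.916 \\ .809 & -.883 & .215 \end{pmatrix} = \begin{pmatrix} 1 & -.494 & -.079 \\ -.494 & 1 & -.103 \\ -.079 & -.103 & 1 \end{pmatrix} \]

This gives the cosines of the angles between the reference
vectors and we see that they are obtuse. The angles are
approximately:

\[ \begin{pmatrix} \cdot & 120^\circ & 95^\circ \\ 120^\circ & \cdot & 96^\circ \\ 95^\circ & 96^\circ & \cdot \end{pmatrix} \]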
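The cosines and angles between the reference vectors can be recomputed as below. This is a sketch only; the columns of the rotating matrix are taken from the worked example, and the function names are illustrative.

```python
import math

# Rotating matrix of the example; each column is one reference vector.
LAMBDA = [
    [0.405, 0.464, 0.338],
    [0.426, 0.076, -0.916],
    [0.809, -0.883, 0.215],
]


def column(matrix, j):
    """Return column j of a matrix given as a list of rows."""
    return [row[j] for row in matrix]


def cosine_between_columns(matrix, i, j):
    """Inner product of two (approximately unit-length) columns."""
    return sum(x * y for x, y in zip(column(matrix, i), column(matrix, j)))


cosines = {(i, j): cosine_between_columns(LAMBDA, i, j)
           for i, j in [(0, 1), (0, 2), (1, 2)]}
angles = {pair: math.degrees(math.acos(c)) for pair, c in cosines.items()}
for pair in cosines:
    print(pair, round(cosines[pair], 3), round(angles[pair]))
```

All three cosines are negative, so all three angles exceed a right angle.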


As soon as we know that the reference vectors are not
orthogonal, we have to take account of the fact that the
primary factors are not identical with them. Each primary
factor is the line in which the hyperplanes intersect,
excluding that hyperplane to which the corresponding
reference vector is orthogonal. In a three-dimensional
common-factor space like ours the primary factors lie
[along] the edges of the pyramid which the extended vectors
form.

[...] of the carpet. Figure 28 itself is on the ceiling, seen from
above as though translucent. The radial lines with 
arrowheads are the projections of the primary factors on 
to the ceiling. The projections of the reference vectors 
are not drawn, to avoid confusion in the figure. They 
are near, but not identical with, the primary factors. 

The reader should not be misled by the fact that two of 
the primary factors lie along the same lines as Tests 1 
and 7. It was necessary to allow this in devising an ex- 
ample with very few tests in it (to avoid much calculation 
and printing large tables). But with a large number of 
tests the lines of the triangle could have been defined 
without any test being actually at a corner. 

3. Primary factors and reference vectors.—At about this
stage a disturbing thought may have occurred to the
reader. We have sought for, and obtained, simple 
structure on the reference vectors. That is to say, we 
have found three vectors, three imaginary tests, which are 
uncorrelated each with a group of the actual tests, namely 
where there are zeros in the table on page 173. The entries 
in that table are the projections of the actual tests on the 


reference vectors. 

But the primary factors are different from the reference 
vectors. The projections of the tests on to the primary 
factors will be different and will not show these zeros. 
Those projections are, in fact, given in this table (never
mind for the moment how it is arrived at):

F(Λ')^{-1}D
Structure on the Primary Factors

        L       B       D
1     .160    .162    .832
2     .408    .666    .793
3     .866    .809    .176
4     .934    .495    .401
5     .541    .927    .152
6     .842    .474    .132
7     .468    .915    .150

These numbers are the correlations between the [primary]
factors and the tests, and none is zero. The primary
factor structure is not “simple,” it is the reference vector
structure that is simple. Why then not use the reference
vectors as our factors?

A two-fold answer can be given to this, one general, the
other particular to this example. The latter will become
clear when we divulge how the example was made. The
former requires us to return to the distinction between
structure and pattern. A structure is a table of correlations,
a pattern is a table of coefficients in a “specification
equation” specifying how a test score is made up by factors.
The entries in a pattern are loadings or saturations of the
tests with the factors, but not correlations.

Pattern and structure are only identical when the
reference vectors are orthogonal and coincide with the
primary factors. When the reference vectors are oblique
(usually at obtuse angles) the primary factors are different
and are themselves usually at acute angles. When the
primary factors and reference vectors thus separate, the
structure of the reference vectors and the pattern of the primary
factors are identical except for a coefficient multiplying
each column; and vice versa the structure of the primary
factors is identical (except for similar coefficients) with the
pattern of the reference vectors. In particular, where
there are zeros in the reference vector structure there will
also be zeros in the primary factor pattern. The general
theorem of the reciprocity of reference vectors and primary
factors (to use our present terms), that is, the reciprocity
of (a) planes, and (b) [...]ons in each case [...]. It occurs
in several other places in the geometry of factorial
analysis: for instance, tests, persons, and factors are all
in one sense reciprocal and exchangeable.

The particular fact about the zeros in the primary factor
pattern can be seen readily from the geometrical analogy.
For a test vector which lies in a hyperplane can be
completely defined as a weighted resultant of the primary
factors which are also in that hyperplane, without any 
assistance from other primary factors. In our drawing 
of the reader's study, for example, on page 157, the vector 
of the Test 2, which is the line O2, lies upon the plane 
face O14 of the pyramid, and can be completely described 
by a weighted sum of the primary factors along the edges
O1 and O4, without bringing in the edge O6 at all. The
primary factor which lies along that edge will therefore
have a zero weight in the row of the pattern which
specifies Test 2. This pattern on the primary factors will be
very similar to the structure on the reference vectors
already given for our example in the table on page 173.
It can, in fact, be calculated from that table by multiplying
the first column by 1.163, the second column by 1.166,
and the third by 1.017, giving the following:

FΛD^{-1}
(Simple) Pattern on the Primary Factors

        L       B       D
1     .029    .013    .826
2     .030    .536    .701
3     .612    .499    .003
4     .895   -.001    .266
5     .096    .880   -.006
6     .809    .062    .000
7     .008    .912    .000


Thus although the primary factors differ from the
reference vectors (the angles between the primary factors
and their corresponding reference vectors are, in fact, 31°,
31°, and 11°), yet if the structure on the reference vectors
is “simple,” the pattern on the primary factors will be
“simple.” The entries in the above table can be used as
coefficients in specification equations, and if for clearness
we omit the near-zero coefficients entirely, we have found
that the test scores can be considered as made up thus:

Score in Test 1 = .826d + Specific
Score in Test 2 = .536b + .701d + Specific
Score in Test 3 = .612l + .499b + Specific
Score in Test 4 = .895l + .266d + Specific
Score in Test 5 = .880b + Specific
Score in Test 6 = .809l + Specific
Score in Test 7 = .912b + Specific

4. Behind the scenes.—It is now time to divulge what
these “tests” really are and how the “scores” were
made whose correlations we have been analysing, and to 
compare our analysis with the reality. The example is a 
simpler and shorter variety of a device used by Thurstone 
and published in April 1940 in the Psychological Bulletin. 
The measurements behind the correlations were not made 
on a number of persons, but were made on a number of 
boxes—only eight boxes, to keep down the amount of 


calculation and printing. These boxes were of the following
dimensions:

Box     Length  Breadth  Depth
1          2       2       1
2          3       2       3
3          3       2       2
4          6       3       2
5          4       4       2
6          5       3       1
7          5       4       3
8          4       4       2
Sum       32      24      16
Mean       4       3       2


The “tests” were seven function[s] [...] and are shown in
the next [...] boxes (as we are unable to m[easure] [...] of
the mind directly) but was [...] complex quantities like LB,
or √(L² + D²) (as we are able to measure scores in com[...]

Test  Formula       1     2     3     4     5     6     7     8     Sum    Mean
1     D²            1     9     4     4     4     1     9     4      36    4.500
2     BD            2     6     4     6     8     3    12     8      49    6.125
3     LB            4     6     6    18    16    15    20    16     101   12.625
4     √(L²+D²)    2.24  4.24  3.61  6.32  4.47  5.10  5.83  4.47  36.28    4.535
5     L+B²          6     7     7    15    20    14    21    20     110   13.750
6     L²+D          5    12    11    38    18    26    28    18     156   19.500
7     B             2     2     2     3     4     3     4     4      24    3.000

With these scores the sums of squares and products of
deviations from the mean are:

        1       2       3       4       5       6       7
1      66     50.5    22.5    10.2    25      29       3
2     50.5    72.9    98.4    16.8   112.3   100.5    16
3     22.5    98.4   273.9    47.9   259.2   398.5    36
4     10.2    16.8    47.9    11.4    37.0    91.3     4.7
5     25     112.3   259.2    37.0   283.5   288      41
6     29     100.5   398.5    91.3   288     800      36
7      3      16      36       4.7    41      36       6


From these the correlations could be calculated by dividing
each row and column by the square root of the diagonal
cell entry. But that would make no allowance for specific 
factors, which in all actual psychological tests play a 
considerable part. In the example devised by Thurstone 
on which this is modelled there are no specific factors, but 
it was decided to introduce them here into Tests 5, 6, and 7, 
by increasing their sums of squares. In addition, by an 
arithmetical slip, a small group factor was added to these 
three tests, and this was not discovered for some time. It 
was decided to leave it, for in a way it makes the example 
more realistic, and may be taken to represent an experi- 
mental error of some sort running through these three tests. 
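The division of each row and column by the square root of the diagonal entry can be sketched as follows. Only Tests 1 to 4 are used, since their sums of squares are stated to have been left unchanged, whereas the amounts added to Tests 5, 6, and 7 are not given. The function name is an assumption.

```python
import math


def correlations_from_products(products):
    """Divide every row and column by the square root of its diagonal entry."""
    n = len(products)
    roots = [math.sqrt(products[i][i]) for i in range(n)]
    return [[products[i][j] / (roots[i] * roots[j]) for j in range(n)]
            for i in range(n)]


# Sums of squares and products of deviations for Tests 1 to 4 only.
products = [
    [66.0, 50.5, 22.5, 10.2],
    [50.5, 72.9, 98.4, 16.8],
    [22.5, 98.4, 273.9, 47.9],
    [10.2, 16.8, 47.9, 11.4],
]

correlations = correlations_from_products(products)
for row in correlations:
    print(["%.3f" % value for value in row])
```

The correlation of Tests 1 and 2, for instance, comes out at about .728.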

With these changes, the correlations are found, and are 
those with which we began this chapter and which we have 
already analysed into three oblique factors L, B, and D. 
Let us now compare that analysis with the formulae which
we now know to represent the tests. The pattern on
page 177, for example, shows that Test 2 depends only on
factors B and D: and that is correct, for it was, in fact,
their product BD, and L did not enter into it. The
analysis gives the test score as a linear function of B and D,

.536b + .701d

whereas it was really a product. But the analysis was
correct in omitting L. Similarly, the analyses into the
other factors can be compared with the actual formulae,
and in almost every case the factorial analysis, except for
being linear, is in agreement with the actual facts. Tests 5
and 6, true, appear in the analysis to omit factors L and D
respectively, although these dimensions figured in their
formulae. But it would appear that they were swamped
by reason of the other dimension in the formulae being
squared; and also possibly the specific and error factors
we added did something towards obscuring smaller details.
Also the process of “ guessing ” communalities, though 
innocuous in a battery of many tests, is a source of con- 
siderable inaccuracy when, as here, the tests are few. 

5. Box dimensions as factors.—We can now explain the
particular reason for selecting the primary factors, and not 
the reference vectors, as our fundamental entities. The 
fundamental entities in the present example can reason- 
ably be said to be the length, breadth, and depth of the 
boxes, given in the table on page 178. Now, the columns 
of that table are correlated with one another, as the reader 
can readily check, the correlation coefficients being— 


L with B, .589
L with D, .144
B with D, .204

These correlations are due to the fact that a long box
naturally tends to be large in all its dimensions. It could,
of course, be very, very shallow, but usually it is deep and
broad.

The reference vectors were, it is true, correlated, but
negatively. They were at obtuse angles with one another
(see page 174) and obtuse angles have negative cosines
corresponding to negative correlations. So the reference
vectors do not correspond to the fundamental dimensions
length, breadth, and depth.
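The reader who wishes to check the correlations between the dimensions of the eight boxes may find the following sketch convenient. The dimensions are those of the table of boxes; the function name is illustrative.

```python
import math

# Dimensions of the eight boxes.
LENGTH = [2, 3, 3, 6, 4, 5, 5, 4]
BREADTH = [2, 2, 2, 3, 4, 3, 4, 4]
DEPTH = [1, 3, 2, 2, 2, 1, 3, 2]


def correlation(x, y):
    """Product-moment correlation from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


print(round(correlation(LENGTH, BREADTH), 3))
print(round(correlation(LENGTH, DEPTH), 3))
print(round(correlation(BREADTH, DEPTH), 3))
```

All three coefficients are positive, in contrast to the negative cosines found between the reference vectors.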

What, then, are the angles—and hence the correlations— 
between the primary factors? We shall find that they 
are acute angles, and their cosines agree reasonably well 
with the above correlations between the length, breadth, 
and depth. The algebraic method of finding these angles 
is given in the mathematical appendix, but it is perhaps 
desirable to give a less technical account of it here. We 
need the direction-cosines of the primary factors, that is, 
the cosines of the angles they make with the orthogonal 
centroid axes. Each primary factor is the intersection 
of n — 1 hyperplanes—in our simple case is the intersection 
of two planes. 

In n-dimensional geometry a linear equation defines a 
hyperplane of n — 1 dimensions. For example, in a plane 
of two dimensions a linear equation is a line (of one dimen- 
sion)—hence the name linear. But in a space of three 
dimensions a “linear” equation like ax + by + cz = d
is a plane. Two such equations define the line which is 
the intersection of two planes. 

Now, the equations of the three planes which form the 
triangular pyramid of which we have previously spoken 
are just those equations we have already obtained and 
used in our example, viz. : 

\[ .405x + .426y + .809z = 0 \]
\[ .464x + .076y - .883z = 0 \]
\[ .338x - .916y + .215z = 0 \]

These equations taken two at a time define the three
edges of the pyramid, which are our primary factors, and
if we express each pair in the form—

\[ \frac{x}{a} = \frac{y}{b} = \frac{z}{c} \]

then the direction cosines are proportional to a, b, and c,
which only require normalizing to be the direction cosines.
When the direction cosines are found in this way, and
written in columns to form a matrix, they prove to have
the values—

\[ (\Lambda')^{-1}D = \begin{pmatrix} .797 & .835 & .503 \\ .400 & .187 & -.843 \\ .453 & -.517 & .192 \end{pmatrix} \]
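One way of carrying out this calculation, offered here only as a sketch and not as the method of the mathematical appendix, is to take the vector product of the normals of each pair of planes: the result lies along their line of intersection and needs only to be normalized. The sense of each edge is chosen so that it makes an acute angle with its own reference vector. Function and variable names are assumptions.

```python
import math

# Normals of the three planes, i.e. the reference vectors (columns of Lambda).
NORMALS = {
    "L": [0.405, 0.426, 0.809],
    "B": [0.464, 0.076, -0.883],
    "D": [0.338, -0.916, 0.215],
}


def cross(a, b):
    """Vector product, lying in both planes whose normals are a and b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return [c / length for c in v]


def primary_factor(name):
    """Edge in which the two planes other than `name`'s own plane intersect."""
    others = [NORMALS[key] for key in NORMALS if key != name]
    edge = normalize(cross(others[0], others[1]))
    # Choose the sense making an acute angle with the factor's own reference vector.
    if dot(edge, NORMALS[name]) < 0:
        edge = [-c for c in edge]
    return edge


primaries = {name: primary_factor(name) for name in NORMALS}
for name, edge in primaries.items():
    print(name, ["%.3f" % c for c in edge])
```

Each printed row corresponds to one column of the matrix of direction cosines above.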


This is the rotating matrix to obtain the projections, i.e. 
the structure, on the primary factors, and if the centroid 
loadings on page 172 are post-multiplied by this there 
results the table we have already quoted on page 175. 

The above matrix, premultiplied by its transpose, gives 


the cosines of the angles between the primary factors. We 
obtain— 


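\[ D(\Lambda'\Lambda)^{-1}D = \begin{pmatrix} 1 & .506 & .150 \\ .506 & 1 & .164 \\ .150 & .164 & 1 \end{pmatrix} \]

Compare these with the correlations between the columns
of dimensions of the boxes, viz.:

\[ \begin{pmatrix} 1 & .589 & .144 \\ .589 & 1 & .204 \\ .144 & .204 & 1 \end{pmatrix} \]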


The resemblance is quite good, and shows that it is the
primary factors, and not the reference vectors, which
represent those fundamen[tal] [...] [c]ase of the boxes (and
[...] example, Thurstone, 1944a, [...] [fa]ctors), it may be
presumed to [...]s when it is applied to mental
measurements. And I confess that the argument is very
strong.

[...] simple structure (oblique if necessary) can be reached
and the factors identified. But are there other ways in which
the test scores could have been made? Spearman's argument
was a similar reversal. If test scores are made with only one
common factor, then zero tetrad-differences result. But
[zero] tetrad-differences can be approached as closely as we
like by samples of a large number of small factors, with
very few indeed common to all the tests.

However, Thurstone’s simple structure is a much more
complex phenomenon than Spearman’s hierarchical order,
and yet he seems to have had no great difficulty in finding
batteries of tests which give simple structure to a
reasonable approximation. I am not sceptical, merely cautious,
and admittedly much impressed by Thurstone’s ability
both in the mathematical treatment and in the devising
of experiments.

Thurstone might, I think, put his case in this way. He
assembles a battery of tests which to his psychological
intuition appear to contain such and such psychological
factors, some being memory tests, some numerical, etc.,
no test, however, containing (to his mind) all these
expected factors. He then submits their correlations to
his calculations, reaches oblique simple structure, and
compares this analysis with his psychological expectation.
If there is agreement, he feels confirmed both in his
psychology and in the efficacy of his method of finding factors
mathematically. Usually there will not be complete
agreement, and he is led to modify his psychological ideas
somewhat, in a certain direction. To test the truth of these
further ideas he again makes and analyses a battery.
Especially he looks to see if the same factors turn up in
various batteries. He uses his analyses as guides to
modifications of his psychological hypotheses, or as
confirmation of them. In Great Britain Thurstone's
hypothesis of simple structure has been, I think it is correct to
say, rather ignored than criticized. Most British
psychologists have imbibed during their education a belief in and
a partiality for “Spearman’s g,” a factor apparently
abolished by Thurstone. Since his work on second-order
factors rehabilitates g, this objection may disappear.
Reyburn and Taylor of South Africa have, however,
criticized simple structure shrewdly (1943a, and a later
paper by Reyburn and Raath, 1949) even although they
themselves do not insist on a g (see 1941a, pages 258, 254,
258).

An early form of response to Thurstone's work was to
show that his batteries could also be analysed after
Spearman's fashion. Holzinger and Harman (1938), using the
Bifactor method, reanalysed the data of Thurstone’s
Primary Mental Abilities and found an important general
factor due, as they truly say, “to our hypothesis of its
existence and the essentially positive correlations
throughout.” Spearman (1939a) in a paper entitled Thurstone's
Work Reworked reached much the same analysis, and raised
certain practical or experimental objections, claiming that 
his g had merely been submerged in a sea of error. But 
there is more in it than that. As I said in my contribution 
to the Reading University Symposium (1939) Thurstone 
could correct all the blemishes pointed out by Spearman 
and would still be able to attain simple structure. I said 
on that occasion that however juries in America and in 
Britain might differ at present, the larger jury of the future 
would decide by noting whether Spearman’s or Thurstone’s 
system had proved most useful in the hands of the prac- 
tising psychologist. I now think that they will certainly 
also consider which set of factors has proved most invariant 
and most real. Very likely the two criteria may lead to 
the same verdict. But for the present the two rival claims 
are in the position described by the Scottish legal phrase, 
“taken ad avizandum.” 

7. Application of multiple-factor analysis to industrial test
data.—Dr. R. Harper, with various co-workers, has applied
these methods of factor analysis, begun in connexion with
psychological tests, to tests of a physical kind on various
substances during their manufacture. In Nature of
November 20th, 1948, Harper and Baron wrote: “In
industrial physics there are occasions when empirical tests
[...] of which is not fully understood, and where the
[relat]ionships between the tests could profitably be studied
by similar means” to those used in psychology, and they
described a centroid analysis, without rotation, of
rheological measurements on cheese.
In the British Journal of Applied Physics of January, 1950, 
Harper, Kent, and Blair gave an account of the factorial 
analysis of ten tests (seven rheological and three electrical) 
on a group of plastics (polyvinyl-chloride-plasticizer mixes). 
They made a centroid analysis, with four iterations, took 
out three factors, and rotated them orthogonally to 
maximize the number of near-zero loadings. They tried 
also other rotations, including one to an approximate 
oblique simple structure, and suggest interpretations of the 
factors arrived at. 


CHAPTER XIII
SECOND-ORDER FACTORS 


1. A second-order general factor—The reason why the 
factors arrived at in the “ box” example were correlated 
was that large boxes tend to have all their dimensions 
large. There is a typical shape for a box, often departed 
from, yet seldom to an extreme degree. Therefore the 
length, breadth, and depth of a series of boxes are corre- 
lated, and so also are Thurstone’s primary factors in such 
a case. There is a size factor in boxes, a general factor 
which does not appear as a first-order factor (those we 
have been dealing with) in Thurstone’s analysis, but 
causes these primary factors to be correlated. Possibly, 
therefore, when oblique factors appear in the factorial 
analysis of psychological tests, there is a hidden general 
factor causing the obliquity. This factor or factors (for 
there might be more than one) can be arrived at by analys- 
ing the first-order factors, into what Thurstone calls 
second-order factors, factors of the factors. 

Of course, whether such a procedure could be justified
by the reliability of the original experimental data is very
doubtful in most psychological experiments. The
superstructure of theory and calculation raised upon those data
is already, many would urge, perhaps rather top-heavy, and
to add a second storey unwise. But we should not, I think,
let this practical question deter us from examining what is
undoubtedly a very interesting and illuminating suggestion,
[...] turn out to be the means of reconciling and
integrating various theories of the structure of the mind.

The [...] primary factors of our “box” example of
Chapter XII [...] were correlated as shown in this matrix:

\[ \begin{pmatrix} 1 & .506 & .150 \\ .506 & 1 & .164 \\ .150 & .164 & 1 \end{pmatrix} \]

If we analyse these in their turn into a general factor
and specifics we obtain, using the formula—

\[ g \text{ saturation} = \left( \frac{r_{ab}\, r_{ac}}{r_{bc}} \right)^{1/2} \]

the saturations of the primary factors with a second-order
g as .680, .744, and .220; and each primary factor will
also have a factor specific. We have now replaced the
analysis of the original tests into three oblique factors by 
an analysis into four orthogonal factors, one of them 
general to the oblique factors and presumably also general 
to the original tests, though that we have still to inquire 
into. We must also inquire into the relationship of the 
specifics of the original tests to these second-order factors,
which are no longer in the original three-dimensional 
common-factor space, but in a new space of four dimen- 
sions. Are the original test-specifics orthogonal to this 
new space ? 
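The three saturations of the primary factors with the second-order g follow directly from the three correlations between the primary factors. A minimal sketch, with an illustrative function name:

```python
import math


def g_saturation(r_ab, r_ac, r_bc):
    """Saturation of variable a with a single general factor."""
    return math.sqrt(r_ab * r_ac / r_bc)


# Correlations between the primary factors L, B and D.
r_LB, r_LD, r_BD = 0.506, 0.150, 0.164

saturations = {
    "L": g_saturation(r_LB, r_LD, r_BD),
    "B": g_saturation(r_LB, r_BD, r_LD),
    "D": g_saturation(r_LD, r_BD, r_LB),
}
for name, value in saturations.items():
    print(name, round(value, 3))
```

The product of any two saturations reproduces the correlation between the two primary factors concerned, which is what an analysis into one general factor and specifics requires.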

With only three oblique factors, an analysis into one g
is always possible (except in the Heywood case, which will
often occur among oblique factors). If there had been
four or more oblique factors, we would have had to use more
second-order general factors unless the tetrad-differences
were zero. Thurstone’s “trapezium” example already
referred to had four oblique factors, and his article should
be consulted by the interested.

2. Its correlations with the tests.—Let us turn now to the
question what the correlations are between the seven
original tests and the above second-order g. To obtain
these Thurstone uses an argument equivalent to the
following:

We may first note that each reference vector makes an
acute angle with its own primary factor, but is at right
angles to every other primary factor, for these are all
contained in the hyperplane to which it is orthogonal.
The cosines of the angles can be obtained by
premultiplying the rotation matrix of the reference vectors by
the transpose of the rotation matrix of the primary
factors.

Correlations between Primary Factors and Reference Vectors

\[ D\Lambda^{-1} \times \Lambda = D \]

\[ \begin{pmatrix} .797 & .400 & .453 \\ .835 & .187 & -.517 \\ .503 & -.843 & .192 \end{pmatrix} \begin{pmatrix} .405 & .464 & .338 \\ .426 & .076 & -.916 \\ .809 & -.883 & .215 \end{pmatrix} = \begin{pmatrix} .860 & \cdot & \cdot \\ \cdot & .858 & \cdot \\ \cdot & \cdot & .983 \end{pmatrix} \]


These cosines in the diagonal of the matrix D give us the 
angles 31°, 31°, and 11° which we have already mentioned 
on page 177 as the angles between each primary factor and 
its own reference vector. 

Each row of the first of the above matrices represents
the projections of the primary factor on to the orthogonal
centroid axes. These are, in fact, the loadings of the primary
factors, thought of as imaginary or possible tests,
in the orthogonal centroid factors I, II, and III. Following
Thurstone, we add these three rows below the seven rows of
our original seven real tests, extending the matrix F in
length thus:


         I       II      III      r_g
1      .449    −.682    .165     .211
2      .825    −.478   −.129     .574
3      .906     .336    .020     .787
4      .846     .133    .457     .666
5      .808     .208   −.412     .719
6      .697     .336    .335     .597
7      .767     .173   −.468     .683
L      .797     .400    .453     .680
B      .835     .187   −.517     .744
D      .503    −.843    .192     .220


post-multiply by 
tion) to give the 
L, B, and D, with the s 
want to know by what 
tiplied so that the weig 
tion of that test with g. Suppose these weights are u, v,
and w. Since we already know from our second-order
analysis what r_g is for each of the primaries L, B, and D,
we have three equations for u, v, and w, the solution of
which gives us their values. We have—

\[
\begin{aligned}
.797u + .400v + .453w &= .680 \\
.835u + .187v - .517w &= .744 \\
.503u - .843v + .192w &= .220
\end{aligned}
\]
and these equations can be solved in the usual way, if
the reader wishes. The values are .798, .198, and −.077.

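A closer examination of them, however, which can be
most readily expressed in matrix notation, leads to an
easier plan, especially desirable if the number of primary
factors were greater. In matrix form the above equations
are—
\[
T \begin{pmatrix} u \\ v \\ w \end{pmatrix} = r_g ,
\]
whence
\[
\begin{pmatrix} u \\ v \\ w \end{pmatrix} = T^{-1} r_g ,
\]
and since T is merely a short notation for DΛ⁻¹ we have—
\[
\begin{pmatrix} u \\ v \\ w \end{pmatrix} = (D\Lambda^{-1})^{-1} r_g = \Lambda D^{-1} r_g .
\]

That is to say, the centroid loadings F of the seven tests
have to be post-multiplied by this, giving a matrix (a
single column)—
\[
F \begin{pmatrix} u \\ v \\ w \end{pmatrix} = F \Lambda D^{-1} r_g .
\]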

But FΛ we already know. It is (see page 178) the simple
structure V on the reference vectors. So we merely have
to multiply the columns of V by D⁻¹r_g, and add the rows to
get the correlation of each test with g. These multipliers
are, that is to say:

.680 ÷ .860 = .791

.744 ÷ .858 = .867

.220 ÷ .983 = .224
The results are the same as by the former method, except
for discrepancies due to rounding off decimals, and are
given to the right of the preceding table.
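
The two routes to the weights u, v, and w can be compared numerically. The following is only a sketch, written in Python, which the book does not use: the function names are our own, and the figures are the rounded ones printed above, so agreement is to be expected only to about the third decimal place.

```python
# Centroid loadings of the primaries L, B, D (the rows of T), as printed.
T = [[0.797, 0.400, 0.453],
     [0.835, 0.187, -0.517],
     [0.503, -0.843, 0.192]]
# Rotation matrix of the reference vectors (Lambda), as printed.
LAM = [[0.405, 0.464, 0.338],
       [0.426, 0.076, -0.916],
       [0.809, -0.883, 0.215]]
R_G = [0.680, 0.744, 0.220]        # second-order g loadings of L, B, D
F_TEST1 = [0.449, -0.682, 0.165]   # centroid loadings of Test 1


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def weights_direct():
    """First method: solve the three simultaneous equations."""
    return solve(T, R_G)


def weights_via_reference_vectors():
    """Second method: Lambda times the multipliers r_g / d."""
    d = [sum(T[i][k] * LAM[k][i] for k in range(3)) for i in range(3)]
    mult = [r / di for r, di in zip(R_G, d)]
    return [sum(LAM[i][k] * mult[k] for k in range(3)) for i in range(3)]


def g_correlation(loadings, weights):
    """Correlation of a test with g from its centroid loadings."""
    return sum(l * w for l, w in zip(loadings, weights))
```

Applying `g_correlation` to the centroid loadings of Test 1 with either set of weights should reproduce, within rounding, the entry in the last column of the table.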

3. A g plus an orthogonal simple structure.—In his own
examples, Thurstone has not calculated the loadings of the
original tests with the other orthogonal second-order
factors, the factor specifics. This can, however, clearly be
done by the same method as above. Since the correlations
of the general factor with the three oblique factors are
.680, .744, and .220, the correlations of each factor specific
with its own oblique factor are .733, .668, and .975. For
example, .733² = 1 − .680². The second-order analysis
therefore is:

\[
\begin{pmatrix} .680 & .733 & \cdot & \cdot \\ .744 & \cdot & .668 & \cdot \\ .220 & \cdot & \cdot & .975 \end{pmatrix} = E
\]

Dividing the rows by the divisors already mentioned, viz.
.860, .858, and .983, we obtain the matrix:

\[
\begin{pmatrix} .791 & .853 & \cdot & \cdot \\ .867 & \cdot & .779 & \cdot \\ .224 & \cdot & \cdot & .992 \end{pmatrix} = D^{-1}E
\]

and when the matrix V is post-multiplied by this we
obtain the following analysis of the original seven tests
into a general factor plus an orthogonal simple structure
of three factors:


General Factor plus Simple Structure

G = V D⁻¹ E

         g       λ       β       δ
1      .211    .021    .009    .805
2      .574    .022    .358    .683
3      .787    .449    .333   −.006
4      .666    .656   −.001    .260
5      .719    .071    .588   −.006
6      .597    .593    .041    .000
7      .683    .005    .609    .000


The zero or very small entries in λ, β, and δ are in the
same places as they are for L′, B′, and D′ in the oblique
simple structure V (see page 178). What we have now
done is to analyse the box data into four orthogonal
factors corresponding to size, and ratios of length, breadth,
and depth. In terms of our pyramidal geometrical
analogy we have “taken out a general factor” by depressing
the ceiling of our room, squashing the pyramid down
until its three plane sides are at right angles to each other.

The above structure, being on orthogonal factors, is also
a pattern, so that the inner products of its rows ought to
give the correlation coefficients with the same accuracy, if
we have kept enough decimal places in our calculations, as
do the rows of the centroid analysis F: and so they do.
For example, the correlation between Tests 1 and 2 is,
from F,

.449 × .825 + .682 × .478 − .165 × .129 = .675
and from G it is—
.211 × .574 + .021 × .022 + .009 × .358 + .805 × .683 = .675
The “experimental” value was .728, the difference of
.053 being due to the inaccuracy of the guessed
communalities, or in an actual experimental set of data to
sampling error and to the rank of the matrix not being
exactly three.
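
The passage from the centroid loadings to the rows of G, and the agreement of the two inner products, can be retraced as follows. This is a sketch only, in Python and with names of our own choosing; it uses the rounded figures printed in this chapter, so the third decimal place may differ by a unit or two from the table.

```python
import math

# Rotation matrix of the reference vectors (Lambda), as printed.
LAM = [[0.405, 0.464, 0.338],
       [0.426, 0.076, -0.916],
       [0.809, -0.883, 0.215]]
D = [0.860, 0.858, 0.983]      # diagonal of D
R_G = [0.680, 0.744, 0.220]    # g loadings of the three primaries
F1 = [0.449, -0.682, 0.165]    # centroid loadings of Test 1
F2 = [0.825, -0.478, -0.129]   # centroid loadings of Test 2


def orthogonal_loadings(f_row):
    """Row of G = V D^-1 E for one test, where V = F Lambda."""
    v = [sum(f_row[k] * LAM[k][j] for k in range(3)) for j in range(3)]
    g = sum(v[j] * R_G[j] / D[j] for j in range(3))
    specifics = [v[j] * math.sqrt(1.0 - R_G[j] ** 2) / D[j]
                 for j in range(3)]
    return [g] + specifics


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    g1, g2 = orthogonal_loadings(F1), orthogonal_loadings(F2)
    print([round(x, 3) for x in g1])
    print(round(inner_product(F1, F2), 3), round(inner_product(g1, g2), 3))
```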

We can see here a distinct step towards a reconciliation 
between the analyses of the Spearman school and those 
of Thurstone using oblique factors. But we must not 
forget that if the oblique factors are not oblique enough, 
the Heywood embarrassment will occur, and a second- 
order g be impossible. The orthogonal factors of G are 
more convenient to work with statistically, but it is possible 
that the oblique factors of V are more realistic both in our 
artificial box example and in psychology. They corre- 
sponded in our case to the actual length, breadth, and 
depth of the boxes. The factors λ, β, and δ of matrix G
correspond to these dimensions after the boxes have all
been equalized in “size.”


PART IV 
THE ESTIMATION OF FACTORS * 


* This use of the word “estimation” has been criticized. By
statisticians the word is restricted to mean the estimation of
unknown parameters from a sample, a process of inference from
sample to parent population. Here the word is used to mean the
“estimation” of a man’s scores in a test (or vocation or
examination) to which he has not been subjected, from a knowledge of his
behaviour in other tests. Factors are imaginary tests and a man's
score in them can be “estimated” in the same way. I would use
another word if I could, but “estimation” seems the natural
expression. Besides, I think the two meanings are fundamentally
alike.


CHAPTER XIV 


REGRESSION AND MULTIPLE CORRELATION 


1. Correlation coefficient as estimation coefficient.—A
correlation coefficient indicates the degree of resemblance
between two lists of marks: and therefore it also indicates
how far a man's position in the one list can be estimated
from his position in the other.

If the correlation between two lists is perfect (r = 1),
we know that his standardized score* in the one list is
exactly the same as in the other (x = y).

If the correlation between the two lists is zero (r = 0),
then the knowledge of a man's position in the one list tells
us nothing whatever about his position in the other list.
If we are compelled to make an estimate of that, we can
only fall back on our knowledge that most men are near
the average and few men are very good or very bad in any
quality. We have, therefore, most chance of being correct
if we guess that this man is average in the unknown test.
(x = 0. The average mark we have agreed to call zero;
marks above average, positive; marks below average,
negative.)

In the first case, when r = 1, we are justified in equating
his unknown score x to his known score y—
\[ x = y \]
In the second case, when r = 0, we are compelled by our
ignorance to take refuge in—
\[ x = 0 \text{ or average.} \]
Both these statements can be summed up in the one
statement—
\[ \hat{x} = ry \]
where the circumflex mark over the x is meant to indicate
that this is an estimated, not a measured, value. If, now,
we consider a case between these, where the correlation is
neither perfect nor zero, it can be shown that this equation
still holds, provided each score is measured in standard
deviation units. Since r is always a fraction, this means
that we always estimate his unknown x score as being
nearer the average than his known y score. That is
because we know that men tend to be average men. If
this man’s y score is high, say—
\[ y = 2 \]
(two standard deviations above the average), and if the
correlation between the qualities x and y is known to be
r = .5, we guess his position in the x test as being—
\[ \hat{x} = ry = .5 \times 2 = 1 \]
i.e. only one standard deviation above the average. This
is a guess influenced by our two pieces of knowledge,
(1) that he did very well in Test y, which is correlated with
Test x, and (2) that most men get round about an average
score (zero). It is a compromise, an estimate. It will
often be wrong; indeed, very seldom will it be exactly
right. But it will be right on the average, it will as often
be an underestimate as an overestimate, in each array
of men who are alike in y. The correlation coefficient,
then, is an estimation coefficient for tests measured in
standard deviation units.

* A test score in what follows always means a standardized score
unless the contrary is stated. But estimates are not in standard
measure in general.
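
The rule of this section is a single multiplication, and may be set down as a small function. The sketch is in Python, and the function name is ours, not the book's.

```python
def estimate_standard_score(r, known_score):
    """Estimate an unknown standardized score as r times the known one."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * known_score


if __name__ == "__main__":
    # The example of the text: y = 2 and r = .5.
    print(estimate_standard_score(0.5, 2.0))
```

With r = 1 the function returns the known score unchanged, and with r = 0 it returns the average, zero, so that the two extreme cases discussed above are both contained in it.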

2. Three tests.—Suppose now that we have three tests
whose intercorrelations are known, and that a man’s scores
on two of them, y and z, are known. We wish to estimate
what his score will most probably be in the other test, x.
It need not be a test in the ordinary sense of the word, but
may be an occupation for which the man is a candidate
or entrant. According as we use his known y or his
known z score, we shall have two estimates for his x score.
To fix our ideas, let us take definite values for the
correlations, say:

         x      y      z
x      1.0     .7     .5
y       .7    1.0     .3
z       .5     .3    1.0
The two estimates for his x are then—

\[ \hat{x} = .7y \]
\[ \hat{x} = .5z \]

and of these we shall have rather more confidence in the
estimate associated with the higher correlation. But we
ought to have still more confidence in an estimate derived
from both y and z. Such an estimate could use not only
the knowledge that y and z are correlated with x, but also
the knowledge that they are correlated to an extent of
r = .3 with each other. Just to take the average of the
above two separate estimates will not utilize this knowledge,
nor will it utilize the fact that the estimate from y (r = .7)
is more worthy of confidence than the estimate from
z (r = .5).

What we want is to know how to combine the two scores
y and z into a weighted total—
\[ (by + cz) \]
which will have the highest possible correlation with x.
Such a correlation of a best-weighted total with another
test is called a multiple correlation. From such a weighted
total of his two known scores we could then estimate the
man’s x score more accurately than from either the y or
the z score alone. It must use all the information we have,
including our information that y and z correlate to an
amount .3.

3. The straight sum, and the pooling square.—In order to
answer this question, we shall first consider the problem
of finding the correlation of the straight unweighted sum
of the scores y + z with x. This is the simplest form of a
problem to which a general answer was given by Professor
Spearman (Spearman, 1913).

We shall put his formula into a very simple form, which
we may call a pooling square. In our present instance we
want to find the correlation of y + z with x (all of these
being, we are assuming, measured in standard deviation
units). We divide the matrix of correlations by lines
separating the “criterion” x from the “battery” y + z
thus:

          x   |    y      z
x       1.0   |   .7     .5
      --------+--------------
y        .7   |  1.0     .3
z        .5   |   .3    1.0

In each of the quadrants of this pooling square (with
unities in the diagonal, be it noted) we are going to form
the sum of all the numbers, and we shall indicate these
sums by the letters:

\[
\begin{matrix} A & C \\ C & B \end{matrix}
\]

(where C is the sum of the cross-correlations between the
battery y + z and the criterion x, which can be regarded
as a second battery of one test only).

Then the correlation of x with y + z is equal to—
\[ \frac{C}{\sqrt{AB}} \]
which in our present example is—
\[ \frac{.7 + .5}{\sqrt{1.0 \times 2.6}} = \frac{1.2}{\sqrt{2.6}} = .744 \]
We therefore in this case get a better estimate from the
sum of y and z than we get from either alone.
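
The arithmetic of the pooling square can be written out as a short routine. What follows is a sketch in Python with names of our own; it assumes that the correlation matrix is entered with the criterion first, as in the square above.

```python
import math

# Correlation matrix in the order x (criterion), y, z, as in the text.
R = [[1.0, 0.7, 0.5],
     [0.7, 1.0, 0.3],
     [0.5, 0.3, 1.0]]


def straight_sum_correlation(r, criterion=0):
    """Correlation of the criterion with the unweighted sum of the rest."""
    battery = [i for i in range(len(r)) if i != criterion]
    a = r[criterion][criterion]                          # criterion quadrant
    c = sum(r[criterion][j] for j in battery)            # cross-correlations
    b = sum(r[i][j] for i in battery for j in battery)   # battery quadrant
    return c / math.sqrt(a * b)


if __name__ == "__main__":
    print(round(straight_sum_correlation(R), 3))
```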

4. The pooling square with weights.—We want, however,
to know whether a weighted sum of y and z will give a still
higher combined correlation with x. With sufficient
patience, we could answer this by trial and error, for the
pooling square enables us to find almost as easily the
correlation of a weighted battery with the criterion.* Let
us, for example, try the battery 3y + z. For this purpose

* The pooling square can also be used to find the correlations or
covariances of weighted batteries with one another. Elegant
developments are Hotelling's ideas of the most predictable criterion
(1935a) and of vector correlation (1936).
we write the weights along both margins of the pooling
square:

                     3      1
            1.0  |  .7     .5
          -------+-------------
  3          .7  | 1.0     .3
  1          .5  |  .3    1.0

and multiply both the rows and the columns by these weights
before forming the sums A, B, and C. The result of the
multiplications in our case is:

      1.0     2.6
      2.6    11.8
and we therefore have—
\[ \text{correlation} = \frac{2.6}{\sqrt{1.0 \times 11.8}} = .757 \]
a higher value than .744 given by the simple sum. So we
have improved our estimation of the man's x score, and
estimates made by taking 3y + z would correlate .757
with the measured values of x.
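
The weighted pooling square differs from the unweighted one only in that each entry is multiplied by the weight of its row and the weight of its column before the quadrants are summed. A sketch in Python follows; the function name is our own and the criterion is assumed to stand first in the matrix.

```python
import math

# Correlation matrix in the order x (criterion), y, z.
R = [[1.0, 0.7, 0.5],
     [0.7, 1.0, 0.3],
     [0.5, 0.3, 1.0]]


def weighted_battery_correlation(r, weights, criterion=0):
    """Correlation of the criterion with a weighted sum of the other tests."""
    battery = [i for i in range(len(r)) if i != criterion]
    if len(weights) != len(battery):
        raise ValueError("one weight is needed for each battery test")
    c = sum(w * r[criterion][j] for w, j in zip(weights, battery))
    b = sum(wi * wj * r[i][j]
            for wi, i in zip(weights, battery)
            for wj, j in zip(weights, battery))
    return c / math.sqrt(r[criterion][criterion] * b)


if __name__ == "__main__":
    print(round(weighted_battery_correlation(R, [3, 1]), 3))
```

Only the proportions of the weights matter here: the battery 6y + 2z gives the same correlation as 3y + z.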

5. Regression coefficients and multiple correlation.—
Similarly we could try other weights for y and z and search
by trial and error for the best. There is, however, a general
answer to this question, namely that the best weights for
y and z are proportional to certain minor determinants of
the correlation matrix. The weight for y is proportional to
the minor left when we cross out the criterion column and
the y row, the weight for z is proportional to minus the
minor left when we similarly cross out the criterion column
and the z row. The matrix of correlations with the
criterion column deleted being:

          y      z
x        .7     .5
y       1.0     .3
z        .3    1.0

the weight for y is therefore proportional to:
\[ \begin{vmatrix} .7 & .5 \\ .3 & 1.0 \end{vmatrix} = .55 \]
and that for z is proportional to:
\[ -\begin{vmatrix} .7 & .5 \\ 1.0 & .3 \end{vmatrix} = .29 \]
that is, they are as .55 : .29. To make these weights not
merely proportional but absolute values we must divide
each of them by the minor left when the row and column
concerned with the “criterion” x are deleted, namely:
\[ \begin{vmatrix} 1.0 & .3 \\ .3 & 1.0 \end{vmatrix} = .91 \]
so that these absolute best weights, for which the technical
name is “regression coefficients,” are—
\[ \frac{.55}{.91}\,y + \frac{.29}{.91}\,z \]
or .6044y + .3187z
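
For a battery of two tests the rule of minors can be followed step by step. The sketch below is in Python, with names of our own, and covers only this two-test case.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix given as two rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def regression_weights_by_minors(r_xy, r_xz, r_yz):
    """Regression coefficients of x on y and z by the rule of minors."""
    # Correlation matrix with the criterion column deleted (columns y, z).
    rows = {"x": [r_xy, r_xz], "y": [1.0, r_yz], "z": [r_yz, 1.0]}
    minor_y = det2([rows["x"], rows["z"]])   # y row crossed out
    minor_z = det2([rows["x"], rows["y"]])   # z row crossed out
    divisor = det2([rows["y"], rows["z"]])   # criterion row crossed out
    return minor_y / divisor, -minor_z / divisor


if __name__ == "__main__":
    b, c = regression_weights_by_minors(0.7, 0.5, 0.3)
    print(round(b, 4), round(c, 4))
```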

We are inviting the reader to take this method of calculating
the regression coefficients on trust; but he can at least
satisfy himself that when applied to the pooling square they
give a higher correlation of battery with criterion than any
other weights do. The result of multiplying the y column
and row by .6044, and the z column and row by .3187, is
the following:

                  .6044   .3187
           1.0  |   .7      .5
          ------+----------------
  .6044     .7  |  1.0      .3
  .3187     .5  |   .3     1.0

which on multiplication becomes

        1.0000  |  .4231   .1593
        --------+----------------
         .4231  |  .3653   .0578
         .1593  |  .0578   .1015

and condenses to

        1.0000    .5824
         .5824    .5824

\[ \text{Multiple correlation} = \frac{.5824}{\sqrt{1 \times .5824}} = .763 \]

which is higher than any other weighting will produce, if the reader
cares to try others. Notice that the sums C and B are here equal
(.5824 = .5824). We can deduce that the inner product of
the regression coefficients with the correlation coefficients
gives the square of the multiple correlation—

\[ .604 \times .7 + .319 \times .5 = .582 = r_m^2 \]


Indeed, we can take this as forming one reason for using
.604 and .319, and not any other numbers proportional to
them, although the latter would give the same order of
merit. We want our estimates of x not merely to be as
highly correlated with the true values of x as is possible,
but also to be equal to them on the average in the long
run, in the sense that our overestimations will, in each
array of men who have the same y and z, be as numerous
as our underestimations, and this is achieved by using not
merely .55 and .29 as weights, but .55 ÷ .91, and .29 ÷ .91.
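
The reader who would rather not take the matter on trust may try a few rival weightings by machine. The following sketch, in Python and with names of our own, uses the closed form of the two regression coefficients for a two-test battery (which is equivalent to the rule of minors given above) and compares the resulting correlation with that of some arbitrary trial weights.

```python
import math

R_XY, R_XZ, R_YZ = 0.7, 0.5, 0.3   # correlations of the worked example


def battery_correlation(b, c):
    """Correlation of by + cz with the criterion x."""
    cross = b * R_XY + c * R_XZ
    battery = b * b + c * c + 2.0 * b * c * R_YZ
    return cross / math.sqrt(battery)


def regression_weights():
    """Regression coefficients of x on y and z."""
    d = 1.0 - R_YZ ** 2
    return (R_XY - R_XZ * R_YZ) / d, (R_XZ - R_XY * R_YZ) / d


def squared_multiple_correlation():
    """Inner product of regression coefficients and criterion correlations."""
    b, c = regression_weights()
    return b * R_XY + c * R_XZ


if __name__ == "__main__":
    b, c = regression_weights()
    print(round(battery_correlation(b, c), 3))
    for trial in [(1, 1), (3, 1), (2, 1), (1, 0)]:
        print(trial, round(battery_correlation(*trial), 3))
```

The trial weights listed are arbitrary choices of ours; none of them should exceed the correlation given by the regression coefficients.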

6. Aitken’s method of pivotal condensation.—When there
are more than two tests y and z in the battery, the application
of the above rules becomes increasingly laborious. It
is desirable, therefore, to have a routine method of calculating
regression coefficients which will give the result as
easily as possible even in the case of a team of many tests.
The method we shall adopt (Aitken, 1937a) is based upon
the calculation of tetrads, as already used in our Chapter V.
We shall first calculate the above regression coefficients
again by this method. Delete the criterion column in the
matrix of correlations, transfer the criterion row to the bottom,
and write the resulting oblong matrix in the top left-hand
corner of the sheet of calculations, preferably on paper
ruled in squares:

                                  Check
                                  Column
A     (1.0)     .3  |  −1     .  |    .3
       .3      1.0  |   .    −1  |    .3
       .7       .5  |   .     .  |   1.2
On the right of the oblong matrix of correlation coefficients
we rule a middle block of columns of the same
number, here two, and on the right of all a check column.
The columns of the middle block we fill with a pattern
of minus ones diagonally as shown, leaving the other cells
empty,* including the bottom row. In the check column
we write the sum of each row. The top left-hand number
of all we mark as the “pivot.” Slab B of the calculation
is then formed from slab A by writing down, in order as
they come, all the tetrad-differences of which the pivot in
A is one corner. Thus the first row of slab B is calculated
thus—
1 × 1 − .3 × .3 = .91
1 × 0 − .3 × (−1) = .3
1 × (−1) − .3 × 0 = −1
1 × .3 − .3 × .3 = .21

and the row is checked by noting that .21 is the sum of the
others. Immediately below this first row a second version
of it is written, with every member divided by the first
(.91). This is to facilitate the calculation of slab C by
having unity again as a pivot. The second row of slab B is
then formed, beginning with—

1 × .5 − .7 × .3 = .29
Throughout the w 
of the first row, onl 


the pivot. 


The same Operation is then 
using the modified first row of 


example, this } 


middle block then gives the 


319, with their 
- the calculation the check c 


k 
tetrad from the sums in the previous slab agrees with the
sum of the new row. Thus .99 is both the sum of its row,
and also the tetrad—

1 × 1.2 − .7 × .3
from slab A.

When the number of tests in the battery is large, the 
calculation of the regression coefficients is a laborious 
business, but probably less so by this method than by 
any other. It will be clear to the reader that so long a 
calculation is not worth performing unless the accuracy of 
the original correlation coefficients is high. Only very 
accurate values can stand such repeated multiplication,
etc., without giving untrustworthy results. In other
words, regression coefficients have a rather high standard 
error.* 
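
The routine of this section lends itself to mechanical statement. The following is a sketch only, in Python, of pivotal condensation as described above, with each pivot converted to unity and with the check column carried along; the function and variable names are ours, and no claim is made that it reproduces Aitken's own layout in every detail.

```python
def aitken_regression(test_corr, criterion_corr):
    """Regression coefficients by pivotal condensation.

    test_corr: correlations among the battery tests (p x p).
    criterion_corr: correlations of the tests with the criterion.
    Returns (coefficients, unconverted pivots).
    """
    p = len(criterion_corr)
    # Slab A: tests, then a diagonal of minus ones, criterion row at bottom.
    rows = [list(test_corr[i]) + [-1.0 if j == i else 0.0 for j in range(p)]
            for i in range(p)]
    rows.append(list(criterion_corr) + [0.0] * p)
    for r in rows:
        r.append(sum(r))                       # check column
    pivots = []
    for _ in range(p):
        pivot = rows[0][0]
        pivots.append(pivot)
        first = [v / pivot for v in rows[0]]   # modified first row
        new_rows = []
        for r in rows[1:]:
            # Tetrad-differences having the (unit) pivot as one corner.
            new = [r[j] - r[0] * first[j] for j in range(1, len(r))]
            # The check entry must equal the sum of the rest of the row.
            assert abs(sum(new[:-1]) - new[-1]) < 1e-9
            new_rows.append(new)
        rows = new_rows
    return rows[0][:-1], pivots


if __name__ == "__main__":
    coefficients, pivots = aitken_regression([[1.0, 0.3], [0.3, 1.0]],
                                             [0.7, 0.5])
    print([round(c, 4) for c in coefficients], [round(v, 2) for v in pivots])
```

For the small example the two pivots are 1 and .91, and the last remaining row holds the two regression coefficients.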

7. A larger example.—Next we give in full the calculation
of the regression coefficients in a slightly larger example,
though one still much smaller than a practical scheme of
vocational advice would involve. Here z₀ is the
“occupation,” and z₁, z₂, z₃, and z₄ are tests. To give the
example an air of reality, these and their intercorrelations
are taken from Dr. W. P. Alexander's experimental study,
Intelligence, Concrete and Abstract (Alexander, 1935).
They were †:

z₁ Stanford-Binet test;
z₂ A picture-completion test;
z₃ Thorndike reading test;
z₄ Spearman's analogies test in geometrical figures.

* Regression weights obtained from one set of data, applied to a
subsequent set, will not usually give a correlation with the criterion
as high as that predicted. The probable defect in its square will
be (Wherry, 1931)—
\[ (1 - r_m^2)(M - 1)/(N - M), \]
where N is the number of persons and M the number of tests.

† In this, as in other instances where data for small examples are
taken from experimental papers, neither criticism nor comment is
in any way intended. Illustrations are restricted to few tests for
economy of space and clearness of exposition, but in the experiments
from which the data are taken many more tests are employed, and
the purpose may be quite different from that of this book.
But the occupation is a pure invention, for purposes of this
illustration only. The correlation matrix is:

         z₀      z₁      z₂      z₃      z₄
z₀     1.00     .72     .63     .58     .41
z₁      .72    1.00     .39     .69     .49
z₂      .63     .39    1.00     .19     .27
z₃      .58     .69     .19    1.00     .38
z₄      .41     .49     .27     .38    1.00

The fact that we possess these correlations means that we 
have given these tests to a sufficiently large number of 
persons whose ability in the occupation is also known. 
The occupation can be looked upon as another test, in 
which marks can be scored. In an actual experiment, 
obtaining marks for these persons’ abilities in the occupa- 
tion is in fact one of the most difficult parts of the work. 
We can now find by Aitken’s method the best weights for
Tests z₁ to z₄ to make their weighted sum correlate as
highly as possible with z₀. For a reason which will be
explained later, I have numbered the tests in the order of
their correlations with the criterion. To make the arithmetic
as easy as possible to follow in an illustration, the
original correlation coefficients are given to two places of
decimals only, and only three places of decimals are kept
at each stage of the calculation. The previous explanation
ought to enable the reader to follow. As an additional
help, take the explanation of the value .454 in the middle of
slab C. It is obtained thus from slab B—

1 × .490 − .079 × .460 = .454


and is typical of all the others. 
of each first tow, only one kind 
through the whole 
mechanical, 


Except for the division 


becomes quite 
left in brackets 


modified first rows. 


Sea om 
hand block is empty, when the regression coefficients
appear in the middle block.*
The result is that we find that the best prediction of a
man's probable success in this occupation is given by the
regression equation—
\[ \hat{z}_0 = .390z_1 + .431z_2 + .222z_3 + .018z_4 \]
We give a candidate the four tests, reduce his scores
to standard measure by dividing by the known standard
deviation of each test, insert these standard scores into
this equation, and obtain an estimated score for him in
the occupation.

COMPUTATION OF REGRESSION COEFFICIENTS
Aitken’s Modified Method with Each Pivot converted to Unity

                                                              Check
A    (1)     .39     .69     .49  |  −1     .     .     .  |  1.57
     .39    1        .19     .27  |   .    −1     .     .  |   .85
     .69     .19    1        .38  |   .     .    −1     .  |  1.26
     .49     .27     .38    1     |   .     .     .    −1  |  1.14
     .72     .63     .58     .41  |   .     .     .     .  |  2.34

Final Regression Coefficients     .390  .431  .222  .018  |  1.061

* The product of all the unconverted pivots, 1 × .848 × .517 ×
.748, is the value .328 of the determinant:
\[
\begin{vmatrix} 1.00 & .39 & .69 & .49 \\ .39 & 1.00 & .19 & .27 \\ .69 & .19 & 1.00 & .38 \\ .49 & .27 & .38 & 1.00 \end{vmatrix}
\]
If this alone were wanted, the middle block, and the criterion row,
would, of course, be unnecessary.

Thus the following three young men
could be placed in their probable order of efficiency in this
occupation from their test scores:


Standard Scores in 


24 Žo 

Tom an 0 2 —55 :81 
Dick —4 —.g 1 BE | hee 
Harry 2 1:3 8 6 88 
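
The use of the regression equation on a candidate's scores may be sketched as follows, in Python. The function names are ours, and the scores of the candidate in the example are purely illustrative values chosen for the purpose; the raw scores are here converted to standard measure by taking the deviation from the mean and dividing by the standard deviation.

```python
# Regression coefficients for z1..z4 found above.
WEIGHTS = [0.390, 0.431, 0.222, 0.018]


def to_standard(raw, mean, sd):
    """Reduce a raw score to standard measure."""
    return (raw - mean) / sd


def estimated_criterion(standard_scores, weights=WEIGHTS):
    """Estimated standard score in the occupation from the four tests."""
    if len(standard_scores) != len(weights):
        raise ValueError("one score is needed for each test")
    return sum(w * s for w, s in zip(weights, standard_scores))


def order_of_merit(candidates):
    """Names of candidates, best estimated score first."""
    return sorted(candidates,
                  key=lambda name: estimated_criterion(candidates[name]),
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical candidate with standard scores .2, 1.3, .8, .6.
    print(round(estimated_criterion([0.2, 1.3, 0.8, 0.6]), 2))
```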


The multiple correlation of such estimates ẑ₀ with the
true values would be obtained by inserting the four
correlation coefficients—

.72    .63    .58    .41

instead of the z's in the regression equation, and taking
the square root, thus—

.390 × .72 + .431 × .63 + .222 × .58 + .018 × .41
= .68847 = r_m²
r_m = .83.
Finally, we can, as we did in the former example, use
the regression weights on a pooling square and see if we
obtain this same multiple correlation of r_m = .83:

                    .390    .431    .222    .018
          1.00  |   .72     .63     .58     .41
         -------+--------------------------------
 .390      .72  |  1.00     .39     .69     .49
 .431      .63  |   .39    1.00     .19     .27
 .222      .58  |   .69     .19    1.00     .38
 .018      .41  |   .49     .27     .38    1.00
018 | -41 -49 27 88 1-00 
As before, we multiply rows and columns by the weights and sum
all the numbers in each quadrant. The easiest way of
doing this in large pooling squares is to multiply the rows
first, then add the columns and multiply the totals by the
column weights, finally adding these products, thus:
Multiply the rows:

                    .390    .431    .222    .018
         1.0000     .72     .63     .58     .41
          .2808    .3900   .1521   .2691   .1911
          .2715    .1681   .4310   .0819   .1164
          .1288    .1532   .0422   .2220   .0844
          .0074    .0088   .0049   .0068   .0180
         ----------------------------------------
Sums      .6885    .7201   .6302   .5798   .4099

If we had kept all decimals these columnar sums would,
since we are using regression coefficients as weights, have
been exactly equal to the top row. With the actual figures
shown, on multiplying the column totals and adding them,
we find that the pooling square condenses to:

         1.0000    .6885
          .6885    .6885
\[ r_m = \frac{.6885}{\sqrt{.6885}} = .83 \text{ as before.} \]
-6885 


8. Using fewer tests.—There is a tendency, which common
sense finds natural, for the regression coefficients of the
tests of a battery to be in the same order of magnitude as
their correlations with the criterion. But this is not
invariably the case, and in the present example, if we
compare the two sets—

correlations with criterion    .72    .63    .58    .41
and regression coefficients   .390   .431   .222   .018,

we see that Test 2 has a higher regression coefficient than
Test 1, although a lower criterion correlation. The reason
lies in the high correlation of Test 1 with Test 3, .69.
They measure to that extent the same thing, and when
Test 3 is introduced into the battery it begins to some extent
to put Test 1’s “nose out of joint.”

The boxed numbers in the calculation on page 205 are
all regression coefficients. If only Test 1 is used, its
regression coefficient is .72. If Tests 1 and 2 are used,
their regression coefficients are .559 and .412. If Tests 1,
2, and 3 are used, their regression coefficients are .397,
.433, and .224. And if all four Tests are used, the four
final numbers are the regression coefficients.

The addition of each test raises the multiple correlation
r_m. We have—

                                                            r_m²
Test 1                 .72 × .72                            = .5184
Tests 1 and 2          .72 × .559 + .63 × .412              = .6622
Tests 1, 2, and 3      .72 × .397 + .63 × .433 + .58 × .224 = .6882
Tests 1, 2, 3, and 4   .72 × .390 + .63 × .431 + .58 × .222
                       + .41 × .018                         = .6885
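
The regression coefficients for the first one, two, three, and four tests can also be had by solving the corresponding sets of simultaneous equations directly. The sketch below, in Python and with names of our own, does this by ordinary elimination rather than by Aitken's layout; since it keeps all decimals, its results may differ from the printed ones in the last place.

```python
R_TESTS = [[1.00, 0.39, 0.69, 0.49],
           [0.39, 1.00, 0.19, 0.27],
           [0.69, 0.19, 1.00, 0.38],
           [0.49, 0.27, 0.38, 1.00]]
CRITERION = [0.72, 0.63, 0.58, 0.41]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def regression_for_first(k):
    """Coefficients and squared multiple correlation using Tests 1..k."""
    a = [row[:k] for row in R_TESTS[:k]]
    c = CRITERION[:k]
    b = solve(a, c)
    return b, sum(bi * ci for bi, ci in zip(b, c))


if __name__ == "__main__":
    for k in range(1, 5):
        b, r2 = regression_for_first(k)
        print(k, [round(x, 3) for x in b], round(r2, 4))
```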


Although the addition of each test raises the multiple 
correlation, some do so only very little; and our caution 
in ordering the tests in accordance with the magnitude of 
the criterion correlation makes it probable, though not 
certain, that the comparatively useless tests will be the later 
ones. We can at each stage of the calculation pause and see 
whether the test we have just added makes a significant 
addition to the multiple correlation. We do this by an 
analysis of variance (see Lindquist, 1940, Chapter V, or
other text-book). Consider, for example, the rise in the
squared multiple correlation from .6622 to .6882. Is the
rise statistically significant? To decide this we must know
the number of persons tested, say N = 105.


Tests                      r_m²     Degrees of     Mean      Ratio F
                                    Freedom        Square
1 and 2                    .6622        2
Increment on adding 3      .0260        1          .0260     .0260 ÷ .0031 = 8.4
Residue                    .3118      101          .0031
Total                     1.0000   N − 1 = 104

The calculation is carried out in the above form, and the
decision whether the increment of r_m² is statistically
significant depends on the size of the ratio F. If it is large
enough, the increase is significant. To decide how large,
consult Table V in Fisher and Yates’s Statistical Tables,
where we find that, with degrees of freedom 1 and 101, a
ratio of 6.88 would be significant at the 1 per cent. point,
i.e. quite highly significant, and 8.4 is even larger than this.
So the increase due to the addition of Test 3 is well worth
while. A similar calculation for the further addition of
Test 4, producing a rise of .0003 in r_m², shows, as might be
expected, that this is not significant, for F is now less than
unity, and Tests 1, 2, and 3 are (with 105 cases) as good
as the whole battery.


Tests                      r_m²     Degrees of     Mean      Ratio F
                                    Freedom        Square
1, 2, and 3                .6882        3
Increment on adding 4      .0003        1          .0003     Less than unity.
Residue                    .3115      100          .0031
Total                     1.0000      104
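
The ratio F for the increment due to one added test can be computed as in the sketch below, which is in Python with a function name of our own. It divides the increment in the squared multiple correlation (one degree of freedom) by the residual mean square. With the figures of the first table it gives a ratio of about 8.4, above the 1 per cent. point of 6.88; with those of the second it gives a ratio far below unity.

```python
def increment_f_ratio(r2_before, r2_after, n_persons, n_tests_after):
    """F ratio for the rise in squared multiple correlation on adding a test.

    Returns (F, residual degrees of freedom).
    """
    increment = r2_after - r2_before
    residual_df = n_persons - n_tests_after - 1
    residual_mean_square = (1.0 - r2_after) / residual_df
    return increment / residual_mean_square, residual_df


if __name__ == "__main__":
    print(increment_f_ratio(0.6622, 0.6882, 105, 3))   # adding Test 3
    print(increment_f_ratio(0.6882, 0.6885, 105, 4))   # adding Test 4
```

Whether a given ratio is significant must still be decided from tables of F, which this sketch does not contain.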


9. Calculation of a reciprocal matrix.—A somewhat longer
method of calculating regression coefficients has two 
advantages: it permits the easy calculation of regression 
coefficients for any criterion (or many) when once the main 
part of the computation is completed, and, what is of great 
importance, it enables the standard errors of the coefficients, 
and of their differences, to be found quickly. 

The method referred to is to find first of all the reci- 
procal of the matrix of correlations of the tests. This is 
done by pivotal condensation also, as illustrated in the 
table overleaf. The matrix whose reciprocal is required 
appears in the top left-hand corner, with a diagonal array 
of minus ones on its right, and a diagonal of plus ones 
below it. The whole is condensed in the manner already 
described on page 205; and the required reciprocal matrix 
and also that nearly half the numbers can be written down
from symmetry.

The regression coefficients for any criterion are then 
obtained by multiplying the rows of the reciprocal by the 
criterion correlations and then adding the columns. In 
the example of page 205 we multiply the first row of the 
reciprocal by :72, the second by -63, and so on. The 
addition of the columns then gives the same regression 
coefficients as were found on page 205. 
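
The use of the reciprocal matrix may be illustrated thus. The sketch is in Python with names of our own, and it finds the reciprocal by ordinary Gauss-Jordan elimination, not by the condensation layout of the text; the rule for obtaining the regression coefficients from it is the one just stated.

```python
R_TESTS = [[1.00, 0.39, 0.69, 0.49],
           [0.39, 1.00, 0.19, 0.27],
           [0.69, 0.19, 1.00, 0.38],
           [0.49, 0.27, 0.38, 1.00]]
CRITERION = [0.72, 0.63, 0.58, 0.41]


def invert(m):
    """Reciprocal of a square matrix by Gauss-Jordan elimination."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def regression_from_reciprocal(reciprocal, criterion_corr):
    """Multiply each row by its criterion correlation, then add the columns."""
    p = len(criterion_corr)
    return [sum(criterion_corr[i] * reciprocal[i][j] for i in range(p))
            for j in range(p)]


if __name__ == "__main__":
    recip = invert(R_TESTS)
    print([round(b, 3) for b in regression_from_reciprocal(recip, CRITERION)])
```

Once the reciprocal has been found, any other criterion requires only a fresh list of criterion correlations.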

10. Variances and covariances of regression coefficients.—
The most important advantage of this method is that
whatever the criterion, the variances and covariances of the
regression coefficients are proportional to the cells of the
above reciprocal matrix (Fisher, 1925, 15 and 1922, 611).
This enables their absolute values for any given criterion
to be obtained by multiplying by 1 − r_m² (the defect of
the square of the multiple correlation from unity), and
dividing by the number of “degrees of freedom” which
is for full correlations N − p − 1 where N is the number
of persons tested, and p the number of tests. For partial
correlations the degrees of freedom are reduced by the
number of variables “partialled out.”

Thus in our example, where p = 4, if N had been 105,
N − p − 1 would be 100. The multiple correlation was
.83, and 1 − r_m² = .312 (see page 206). The variances and
covariances of our four regression coefficients are in this
case equal to the reciprocal matrix multiplied by .00312.

 .0075   −.0017   −.0042   −.0016
−.0017    .0038    .0006   −.0004
−.0042    .0006    .0061   −.0004
−.0016   −.0004   −.0004    .0042

The standard errors of the regression coefficients are the
square roots of the diagonal elements:

Regression coefficients    .390    .431    .222    .018
Standard errors            .087    .062    .078    .065
Significant?               Yes     Yes      ?      No

The correlations of the regression coefficients will be got
by dividing each row and column by the square root of
the diagonal element. We obtain:
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1:00 — 31 —-62 —-28 

—'31 1-00 "12 —-10 

E202 :12 1:00 —-08 

— 28 — 10 —-08 1-00 
We can now calculate the standard error of the difference
between any pair of the regression coefficients and see
whether they differ significantly. Take, for example, those
for Test 1 (.390) and Test 2 (.431). The difference is .041.
Its standard error is the square root of the sum of the two
variances less twice the covariance (which is here negative,
so that the third term is added):

    .0075 + .0038 + 2 × .31 × .087 × .062 = .0146

Therefore the standard error of .041 is .121.

The difference is therefore not significant when N = 105.
Had N been larger it might have been.
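
The three steps just described (standard errors, correlations of the coefficients, and the standard error of a difference) can be sketched as follows. This is only an illustrative check, assuming the matrix of variances and covariances printed above; the function names are not from the text.

```python
import math

# Variances and covariances of the four regression coefficients,
# as printed in the text for N = 105.
cov = [
    [0.0075, -0.0017, -0.0042, -0.0016],
    [-0.0017, 0.0038, 0.0006, -0.0004],
    [-0.0042, 0.0006, 0.0061, -0.0004],
    [-0.0016, -0.0004, -0.0004, 0.0042],
]


def standard_errors(c):
    """Square roots of the diagonal elements."""
    return [math.sqrt(c[i][i]) for i in range(len(c))]


def correlations(c):
    """Divide each row and column by the square root of its diagonal element."""
    se = standard_errors(c)
    n = len(c)
    return [[c[i][j] / (se[i] * se[j]) for j in range(n)] for i in range(n)]


def se_of_difference(c, i, j):
    """var(b_i - b_j) = var(b_i) + var(b_j) - 2 cov(b_i, b_j)."""
    return math.sqrt(c[i][i] + c[j][j] - 2.0 * c[i][j])


se = standard_errors(cov)
corr = correlations(cov)
se_diff = se_of_difference(cov, 0, 1)  # Test 1 against Test 2
```

The difference .431 - .390 = .041 can then be compared with `se_diff`, which is about three times as large.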
11. The geometrical picture of regression.—Before we close
this chapter it will be illuminating to consider what
regression and estimation mean in terms of the geometrical
picture of Chapter VI. Consider the illustration used in
the earlier pages of the present chapter, with the matrix:

           x      y      z
    x    1.0     .7     .5
    y     .7    1.0     .3
    z     .5     .3    1.0


Here x is the criterion, y and z are the tests. Each of
them can be represented by a directed line, as explained in
Chapter VI, with angles between these lines such that
their cosines are the above correlations. The three lines
will then be in an ordinary space of three dimensions.

The two tests y and z themselves have, of course, lines
which lie in a plane: any two lines springing from the
same point as origin lie in a plane. The criterion line x
is not in this plane (say, the table top, on which we may
suppose y and z to lie), but makes an angle with it.
The problem of regression and multiple correlation is, in
geometrical terms, to find the line in the plane
which makes the smallest possible angle with the
criterion line. The smallest possible angle corresponds to the
highest possible correlation. Clearly this desired line is the
line which is the projection of the line x on to the yz plane,
the shadow thrown by x on the table with the sun right
overhead. In Figure 29 it is the line OB, where B is
vertically below a point A on the line x.

Figure 29.

The regression coefficients are numbers which express
the proportions in which the tests y and z have to be
combined to give this line OB. It is just like the parallelogram
of forces. If from B we draw parallels to the two test lines,
we obtain OY and OZ as the distances to be measured along
the two test lines to give a resultant along OB, which is as
near as we can come to OA. (No combination of y and z
can give a line out of their plane.) If the distance OA is
taken as unity, the distances OY and OZ are the actual
regression coefficients. If a wire model like Figure 29 were
made with the proper angles, with cosines x with y equal to
.7, x with z equal to .5, and y with z equal to .3, the distances
OY and OZ would be found to be .6044 and .3187. And
the cosine of the angle BOA would be .763, the value we
found for the multiple, or highest possible, correlation of
the two-test battery with x in Section 5 of this chapter,
page 200.
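
The distances OY and OZ and the cosine of the angle BOA can be checked numerically. The sketch below uses the ordinary two-predictor regression formulas; the function name and argument names are illustrative assumptions, not notation from the text.

```python
import math


def two_test_regression(r_xy, r_xz, r_yz):
    """Regression of criterion x on tests y and z (all in standard measure).

    Returns (b_y, b_z, multiple_r), where b_y and b_z correspond to the
    distances OY and OZ when OA is taken as unity.
    """
    denom = 1.0 - r_yz ** 2
    b_y = (r_xy - r_xz * r_yz) / denom
    b_z = (r_xz - r_xy * r_yz) / denom
    # Squared multiple correlation: sum of weight times correlation.
    multiple_r = math.sqrt(b_y * r_xy + b_z * r_xz)
    return b_y, b_z, multiple_r


b_y, b_z, r_m = two_test_regression(0.7, 0.5, 0.3)
angle_boa_degrees = math.degrees(math.acos(r_m))
```

Here `r_m` is the cosine of the angle between the criterion line and its projection OB, so `angle_boa_degrees` is the smallest angle any line in the yz plane can make with x.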

12. Estimation the same as projection.—Let us now
consider a man P whose two scores in the Tests y and z we
know, and whose probable score in Test x we wish to
estimate. His two scores OM and ON in y and z enable
us to assign to this man a point P on the yz plane, a point
so chosen that its projections on to the y and z vectors
give the scores made by him in those tests (see Figure 30).
But we cannot say that this is his point in the three-dimensional
space of x, y, and z. His point in that space may be anywhere
on a line P'PP" at right angles to the plane yz. For
from anywhere on that line, projections on to y and z fall
on the points M and N. Yet the projection on to the
vector x, which gives his score in the criterion test x,
depends very much on the position of his point on the line
P'PP". All the people represented by points on that line
have the same scores in y and z but different scores in x,
and our man may be any one of them. Before deciding
what to do in these circumstances, let us consider this set
of people P'PP" in more detail.

Figure 30.

It will be remembered that the whole population of
persons is represented by a spherical swarm of points,
crowded together most closely round about the origin O,
and the plane containing y and z divides this
spherical swarm into equal hemispheres. It follows that
a line like P'PP" is a chord of the sphere at right angles to
a diameter (the line OP), and consequently that it is
peopled symmetrically on both sides of P, both upwards
along PP' in our figure, and downwards along PP", the
men on the line being most crowded near the point P itself.
The average man of the array of men P'PP" (who are all
alike in their scores in the two tests y and z) is therefore
the man at P, and since we do not know exactly where
our candidate's point is along P'PP", we take refuge in
guessing that he is the average man of his group and is at
the point P itself. From P, therefore, we drop a
perpendicular on to the vector x, and take the distance OL as
representing his estimated score in that test. This
geometrical procedure corresponds exactly to the calculation
we made, as a little solid trigonometry will show the
mathematical reader. The non-mathematical reader must
take it on trust, but the model may illuminate the
calculation. OL is the average of all the different scores x that a
person with scores OM and ON can have. The estimate
will only be certain if the line x itself is on the table; it
will be less and less certain, the more the line x is inclined
to the table.
It should be noted that the angles which three test
vectors make with each other are impossible angles, if the
determinant of the matrix of correlations becomes negative.
Ordinarily, that determinant is positive. In our present
example we have, for example:

$$\begin{vmatrix} 1.0 & .7 & .5 \\ .7 & 1.0 & .3 \\ .5 & .3 & 1.0 \end{vmatrix} = .38$$

Such a determinant, however, though it cannot be
negative, can be zero, namely in the cases where the two
smaller angles together exactly equal the largest. In that
case the three vectors lie in one plane: the criterion line has
sunk until it too lies on the table. In that case alone,
when the determinant is zero, the "estimation" is certain,
and all the people in the line P'PP" have not only the same
scores in y and z, but also the same scores in x. The
vanishing of the above determinant therefore shows that
this is so. And in more than three dimensions, although
we can no longer make a model, the vanishing of the
determinant:

$$\begin{vmatrix} 1 & r_{01} & r_{02} & r_{03} & \cdots & r_{0n} \\ r_{01} & 1 & r_{12} & r_{13} & \cdots & r_{1n} \\ r_{02} & r_{12} & 1 & r_{23} & \cdots & r_{2n} \\ r_{03} & r_{13} & r_{23} & 1 & \cdots & r_{3n} \\ \vdots & \vdots & \vdots & \vdots & & \vdots \\ r_{0n} & r_{1n} & r_{2n} & r_{3n} & \cdots & 1 \end{vmatrix} = \Delta, \text{ say,}$$

shows that the criterion $z_0$ can be exactly estimated from
the team $z_1, z_2, \ldots, z_n$. In fact, the multiple correlation
$r_m$, which we have already learned to calculate in another
way, can also be calculated as

$$r_m = \sqrt{1 - \frac{\Delta}{\Delta_{00}}}$$

where $\Delta$ is the whole determinant, and $\Delta_{00}$ is the minor
left after deleting the criterion row and column. This
expression clearly becomes equal to unity when $\Delta = 0$.
In our small example x, y, z, we have:

$$\Delta = .38, \qquad \Delta_{00} = .91, \qquad r_m = \sqrt{1 - \frac{.38}{.91}} = \sqrt{.5824} = .763$$

as we already know it to be from page 200.
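
The determinantal formula can be tried on the same three-variable example. The following is a minimal sketch, assuming the criterion occupies row and column 0 of the matrix; the cofactor-expansion routine is a plain textbook method chosen for brevity and is only suitable for small matrices.

```python
import math


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def multiple_r_from_determinants(r):
    """r_m = sqrt(1 - Delta / Delta_00), criterion in row/column 0."""
    delta = det(r)
    delta_00 = det([row[1:] for row in r[1:]])
    return math.sqrt(1.0 - delta / delta_00)


r = [
    [1.0, 0.7, 0.5],
    [0.7, 1.0, 0.3],
    [0.5, 0.3, 1.0],
]
delta = det(r)
r_m = multiple_r_from_determinants(r)
```

If the criterion row were an exact linear combination of the test rows, `delta` would be zero and `r_m` would come out as unity.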
13. The "centroid" method and the pooling square.—The
pooling square, which we have learned to use in this
chapter, enables us to see in another light the nature of the
factors first arrived at by the "centroid" method.

                                   Equal weights
                       z1   |   z1     z2     z3     z4
              z1        1   |    1    r12    r13    r14
              --------------+----------------------------
              z1        1   |    1    r12    r13    r14
    Equal     z2      r12   |  r12      1    r23    r24
    weights   z3      r13   |  r13    r23      1    r34
              z4      r14   |  r14    r24    r34      1

Let us suppose that the tests $z_1$, $z_2$, $z_3$, and $z_4$ have the
correlations shown, and let us by the aid of a pooling square
find the correlation of each of them with the average of all.
This means giving each test an equal weight in pooling it.

The correlation of $z_1$ with the average of all is then
obtained from the above pooling square, which condenses to:

    1                               |  Sum of the first column of
                                    |  the table of correlations
    --------------------------------+------------------------------
    Sum of the first column of      |  Sum of all the cells of
    the table of correlations       |  the table of correlations

so that the correlation is the sum of the first column divided
by the square root of the sum of all the cells.


This, however, is exactly the centroid or simple
summation process applied to a table with full communalities of
unity. The first centroid factor obtained from such a table
is simply for each individual the average of his four test
scores, and the method is called the "centroid" method,
because "centroid" is the multi-dimensional name for an
average (Vectors, Chapter III; and see Kelley, 1935, 59).
The line in our geometrical picture, which represents the
first centroid factor, is in the midst of the radiating lines
which represent the tests, like the stick of a half-opened
umbrella among the ribs. It does not, however, make
equal angles with the test lines unless these all make
equal angles with each other. If several of them are
clustered together, and the others spread more widely,
the factor will lean nearer to the cluster.
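
As a small numerical sketch of the pooling-square result, the correlation of each test with the equally weighted average is its column sum divided by the square root of the grand total. The correlations used below are those of the four-test Spearman battery quoted in Chapter XV, borrowed here purely as illustrative input; the function name is an assumption.

```python
import math


def first_centroid_loadings(r):
    """Correlation of each test with the equally weighted sum of all tests.

    r is a full correlation table with unit communalities in the diagonal.
    """
    column_sums = [sum(column) for column in zip(*r)]
    grand_total = sum(column_sums)
    return [c / math.sqrt(grand_total) for c in column_sums]


r = [
    [1.00, 0.72, 0.63, 0.54],
    [0.72, 1.00, 0.56, 0.48],
    [0.63, 0.56, 1.00, 0.42],
    [0.54, 0.48, 0.42, 1.00],
]
loadings = first_centroid_loadings(r)
```

Replacing the unit diagonal by communalities, as the text goes on to describe, would need no change to the function, only to the diagonal of `r`.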

In the foregoing explanation the communalities have
been taken as unity, and the factor axis was pictured in
the midst of the test lines. If smaller communalities
are used, the only difference is that a specific component
of each test is discarded, and the first-factor axis must be
pictured as in the midst of the lines representing the other
components of the tests. It can be shown that when
communalities less than unity are used, if we bear in mind
that the communal components of the tests are not then
standardized, the pooling square gives the correlations
exactly as before, if we use communalities instead of units
in the diagonal.


The first centroid factor is the average of the communal 
parts of the tests. 

The later factors in their turn are, in a sense, averages
of the residues. There are, however, some complications,
the first being that the average of the residues just as they
stand is zero. The manner in which Thurstone
circumvents this has already been described in Chapter V.

14. The most predictable criterion.—Often a criterion is
also composed of parts, just as a battery of tests is. If it is
success in an occupation, the journeyman may be judged
for skill, for regularity of attendance, for his manner in
dealing with colleagues or customers, etc. Some of these
items will consciously or unconsciously be weighted more
heavily than others in an adjudicator's assessment of the
man; and so too in the assessment of a boy's success in a
secondary school. If the weights are thus decided by
employer, or by headmaster, the criterion score becomes
again one number, the sum of the arbitrarily weighted
parts.

Hotelling, however, raised and solved the question of
how to weight the parts of a criterion so that it would
correlate most highly with a given battery of tests, also
weighted in its best way (Hotelling, 1935a, and see Thomson
1947, 1948, and M. S. Bartlett, 1948). There are, then,
indeed two weighted batteries. In terms of our
geometrical analogy, the criterion is now no longer a line, as in
Figures 29 and 30, but a space, and the problem is to find
a line in the criterion space, and one in the battery space,
which will be as near to each other as possible, both
springing from an origin O common to both spaces. This
technique, which the reader will find illustrated by an
arithmetical example in Thomson (1947, 1948), would, for
instance, enable weights to be given to the tests in two
different batteries to make these agree with one another as
much as possible.

15. Weighting for battery reliability.—A special case
arises when the two batteries are composed of alternative
forms of the same tests, when the correlation between the
two batteries is the battery reliability, which can be
enhanced by suitable weighting.

Thomson (1940) described how to find the best weights
for battery reliability, as a special case of Hotelling's
"most predictable criterion," and Peel (1947) has given a
simpler formula than Thomson's (see page 353 in the
Mathematical Appendix, Section 9a). If there are only
two tests in the battery, with reliabilities $r_{11}$, $r_{22}$ and
correlating with one another $r_{12}$, then Peel's formula gives
as the maximum attainable reliability the largest root $\mu$ of
the equation

$$\begin{vmatrix} r_{11} - \mu & r_{12}(1 - \mu) \\ r_{12}(1 - \mu) & r_{22} - \mu \end{vmatrix} = 0$$

that is, $\mu^2(1 - r_{12}^2) - \mu(r_{11} + r_{22} - 2r_{12}^2) + (r_{11}r_{22} - r_{12}^2) = 0$.
If, for example, $r_{12} = .5$, $r_{11} = .7$, and $r_{22} = .8$, the quadratic
has roots .843 and .490, and a battery reliability of .843
is attainable by using weights proportional to either row of
the above determinant with $\mu = .843$, taken reversed and
with alternate signs, that is

    .0785 and .1431
    or .0431 and .0785
    or 1 and 1.8 approximately.
If as a check we set out a pooling square for the two
batteries it will be:

                1      1.8   |    1      1.8
        1      1.0     .5    |   .7      .5
        1.8     .5    1.0    |   .5      .8
        ---------------------+----------------
        1       .7     .5    |  1.0      .5
        1.8     .5     .8    |   .5     1.0

and if we multiply the rows and columns by the weights
shown, and add together the quadrants, this reduces to:

        6.04   |  5.092
        5.092  |  6.04

giving a battery self-correlation or reliability of

$$\frac{5.092}{6.04} = .843$$

as expected.

When there are more than two tests, the solution of the
above determinantal equation becomes laborious and
difficult. Green (1950) has given a transformation of the
equation which enables an iterative process to be used in
its solution, making it more practicable (see the
Mathematical Appendix, page 353).

Clearly the weights making a battery as reliable as
possible will not be the same as those making it most valid
in predicting a given criterion. There is here a conflict
of aims, for we want a battery to be both as valid and as
reliable as possible. It is very desirable that some
reasonably simple form of calculation should be devised to find
those weights which should be given to the tests of a battery
which, for a given criterion, would make the best
compromise, making reliability equal to validity and both as
great as possible (see Thomson 1940, pages 364 to 365).
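
The two-test reliability example of Section 15 can be reproduced in a few lines. This is a sketch under stated assumptions: the function names are illustrative, and, as in the pooling square of that section, the correlation between one form of Test 1 and the other form of Test 2 is taken to equal $r_{12}$.

```python
import math


def peel_two_tests(r11, r22, r12):
    """Largest root of Peel's determinantal equation and matching weights."""
    a = 1.0 - r12 ** 2
    b = -(r11 + r22 - 2.0 * r12 ** 2)
    c = r11 * r22 - r12 ** 2
    mu = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    # First row of the determinant, reversed and with alternate signs.
    weights = (r12 * (1.0 - mu), -(r11 - mu))
    return mu, weights


def pooled_reliability(r11, r22, r12, weights):
    """Correlation between two parallel weighted batteries (pooling square)."""
    w1, w2 = weights
    within = w1 * w1 + w2 * w2 + 2.0 * w1 * w2 * r12
    across = w1 * w1 * r11 + w2 * w2 * r22 + 2.0 * w1 * w2 * r12
    return across / within


mu, weights = peel_two_tests(0.7, 0.8, 0.5)
check_exact = pooled_reliability(0.7, 0.8, 0.5, weights)
check_rounded = pooled_reliability(0.7, 0.8, 0.5, (1.0, 1.8))
```

`check_rounded` corresponds to the quotient 5.092 / 6.04 in the text, and `check_exact` shows that the unrounded weights reproduce the root itself.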


CHAPTER XV

THE ESTIMATION OF A MAN'S FACTORS

1. Estimating a man's "g."—So far, our discussion of
estimation in Chapter XIV has had nothing immediate to
do with factorial analysis. We are next, however, going
to apply these principles of estimation to the problem of
estimating a man's factors, given his test scores. As we
have already explained in Chapter VII, there is no need to
"estimate" factors when unity is retained in each diagonal
cell; they can be calculated without any loss of exactness
because they are equal in number to the tests: and even
if we analyse out only a few of them, they can be exactly
calculated for a man from his test scores. When we say
exactly here, we mean that the factors are known with the
same exactness as the test scores which are our data.

When communalities are used, however, factors are
more numerous than the tests, and can therefore only be
"estimated." Two men with the same set of test scores
may have different factors. All we can do is to estimate
them, and since the test scores of the two men are the
same, our estimates of their most probable factors will
be the same. The problem does not differ essentially
from the estimation of occupational success or of ability in
any "criterion" test. The loadings of a factor in each
test give the $z_0$ row and column of the correlation matrix.
Let us first consider the case of a hierarchical battery of
tests, and the estimation of g, taking for our example
the first four tests of the Spearman battery used as
illustration in Chapter I, with these correlations:

            z1      z2      z3      z4
    z1    1.00     .72     .63     .54
    z2     .72    1.00     .56     .48
    z3     .63     .56    1.00     .42
    z4     .54     .48     .42    1.00

These correspond, in the analogy with the ordinary cases
of estimation of the first chapter of this part, to the tests
given to a candidate. In those cases, however, there was
a real criterion whose correlations with the team of tests
were known, and formed the $z_0$ row and column of the
matrix. Here the "criterion" is g, and it cannot be
measured directly; it can only be estimated in the manner
we are now about to describe. We have here, therefore,
no row and column of experimentally measured correlations
for the criterion $z_0$, or g in the present case (Thomson,
B.J.P. 25, 94). From the hierarchical matrix of
intercorrelations of the tests, however, we can calculate the
"saturation" or "loading" of each test with the
hypothetical g, and use these for our criterion column and row
of correlations. These saturations are the correlation
coefficients which would be found between each test and a test
of pure g with no specific. We thus arrive at the matrix:

             g      z1      z2      z3      z4
    g     1.00     .90     .80     .70     .60
    z1     .90    1.00     .72     .63     .54
    z2     .80     .72    1.00     .56     .48
    z3     .70     .63     .56    1.00     .42
    z4     .60     .54     .48     .42    1.00

and we want to estimate g from the test scores $z_1$ to $z_4$,
just as we estimated ability in an occupation, and the
calculation is the same. We can, for example, use Aitken's
method to find the regression coefficients, although the
hierarchical qualities of the matrix allow, as we shall
shortly see, an easier way. It is illuminating for the student
actually to work out the regression coefficients as in an
ordinary case, as shown below. From the scores $z_1$, $z_2$,
$z_3$, and $z_4$ which a man has made in the tests, we can
estimate his g by the equation:

$$\hat{g} = .5531z_1 + .2595z_2 + .1602z_3 + .1095z_4$$

In the calculation the pivot of each slab is shown in
parentheses, with its reciprocal in parentheses before it;
the last column is the check column.

                (1.00)    .72     .63     .54  | -1.00     .       .       .
                  .72    1.00     .56     .48  |   .     -1.00     .       .
                  .63     .56    1.00     .42  |   .       .     -1.00     .
                  .54     .48     .42    1.00  |   .       .       .     -1.00
                  .90     .80     .70     .60  |   .       .       .       .

    (2.0764)   (.4816)   .1064   .0912  |   .72    -1.00      .       .
                1.0000   .2209   .1894  |  1.4950  -2.0764    .       .
                 .1064   .6031   .0798  |   .63      .      -1.00     .     |  .4193
                 .0912   .0798   .7084  |   .54      .        .     -1.00   |  .4194
                 .1520   .1330   .1140  |   .90      .        .       .     | 1.2990

    (1.7253)   (.5796)   .0596  |  .4709   .2209   -1.00      .     |  .3311
                1.0000   .1028  |  .8124   .3811   -1.7253    .     |  .5712
                 .0597   .6911  |  .4037   .1894     .      -1.00   |  .3438
                 .0994   .0852  |  .6728   .3156     .        .     | 1.1730

    (1.4599)   (.6850)  |  .3552   .1666   .1030   -1.00    |  .3097
                1.0000  |  .5186   .2432   .1504   -1.4599  |  .4521
                 .0750  |  .5920   .2777   .1715     .      | 1.1162

    Regression Coefficients   .5531   .2595   .1602   .1095  | 1.0823



The multiple correlation of such estimates in a large
number of cases with the true values of g will be by analogy
with our former case given by:

$$r_m^2 = .5531 \times .90 + .2595 \times .80 + .1602 \times .70 + .1095 \times .60 = .883$$

$$r_m = .940$$
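
The regression weights for g and the resulting multiple correlation can be obtained by solving the normal equations directly. The sketch below uses a small Gauss-Jordan routine as a stand-in for Aitken's pivotal condensation; it is not a reproduction of Aitken's layout, and `solve` is a helper written for this illustration.

```python
import math


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * pv for v, pv in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


# Intercorrelations of the four tests and their saturations with g.
R = [
    [1.00, 0.72, 0.63, 0.54],
    [0.72, 1.00, 0.56, 0.48],
    [0.63, 0.56, 1.00, 0.42],
    [0.54, 0.48, 0.42, 1.00],
]
saturations = [0.90, 0.80, 0.70, 0.60]

weights = solve(R, saturations)
r_m = math.sqrt(sum(w * s for w, s in zip(weights, saturations)))
```

The `weights` agree with the bottom row of the condensation to about four decimal places, and `r_m` with the value .940.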


We must remember, however, that such a correlation here
is rather a fiction. We had in the former case the
possibility of comparing our estimates with the candidate's
eventual performance in the occupation or criterion $z_0$.
Here we have no way of knowing g; we only have the
estimates.

As before, we can check the whole calculation by a
pooling square (see page 200).

Estimating g from a hierarchical battery is therefore,
mathematically, exactly the same problem as estimating
any criterion, and can be done arithmetically in the same
way. Because of the special nature of the hierarchical
matrix of correlations, however, with its zero
tetrad-differences, there is an easier way of calculating the estimate
of g, due to Professor Spearman himself (Abilities, xviii).
For its equivalence mathematically to the above see
Appendix, paragraph 10.
Meanwhile we shall illustrate it by an example which
will at least show that it is equivalent in this instance.
The calculation is best carried out in tabular form, and is
based entirely on the saturations or loadings of the tests
with g, which are also their correlations with g.

                                                                        Regression
                                                                        Coefficients
    Test   r_ig   r_ig^2   1 - r_ig^2   r_ig^2/(1 - r_ig^2)   r_ig/(1 - r_ig^2)   [r_ig/(1 - r_ig^2)]/(1 + S)
    1      .9     .81      .19          4.2632                4.7368              .5531
    2      .8     .64      .36          1.7778                2.2222              .2596
    3      .7     .49      .51           .9608                1.3725              .1603
    4      .6     .36      .64           .5625                 .9375              .1095
                                    S = 7.5643

The result, with much less calculation, is the same.
The quantity S is of some importance in this formula. It
is formed in the fourth column of the table, from which
it will be seen that

$$S = \frac{r_{1g}^2}{1 - r_{1g}^2} + \frac{r_{2g}^2}{1 - r_{2g}^2} + \cdots + \frac{r_{ng}^2}{1 - r_{ng}^2}$$

It is clear that S will become larger and larger as the
number of tests is increased.
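
The tabular calculation can be written as a short function. This is a minimal sketch of the arithmetic in the table above, with illustrative names; it assumes only the four saturations .9, .8, .7, and .6.

```python
def spearman_g_weights(saturations):
    """Regression weights for g from the saturations of a hierarchical battery.

    Returns (weights, S), where S is the sum of r^2 / (1 - r^2).
    """
    s = sum(r * r / (1.0 - r * r) for r in saturations)
    weights = [(r / (1.0 - r * r)) / (1.0 + s) for r in saturations]
    return weights, s


saturations = [0.9, 0.8, 0.7, 0.6]
weights, s = spearman_g_weights(saturations)

# Square of the multiple correlation: sum of weight times saturation.
r_m_squared = sum(w * r for w, r in zip(weights, saturations))
```

Adding further tests to `saturations` increases `s` and so raises `r_m_squared`, which illustrates the remark that S grows with the number of tests.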


Now, we saw that the square of the multiple correlation
$r_m$ is obtained when we multiply each of the weights by $r_{ig}$
and sum the products. That is to say:

$$r_m^2 = \sum (\text{weight} \times \text{saturation}) = \sum \frac{1}{1 + S} \cdot \frac{r_{ig}}{1 - r_{ig}^2} \cdot r_{ig} = \frac{S}{1 + S}$$

... multiple correlation.

2. Estimating two factors simultaneously.—We have seen
in the preceding section how to estimate a man's g from
his scores in a hierarchical team of tests, and in this we
shall consider the broader question of estimating factors in
general. Thus in Chapter V the four tests with
correlations:

          1     2     3     4
    1     .    .4    .4    .2
    2    .4     .    .7    .3
    3    .4    .7     .    .3
    4    .2    .3    .3     .

were analysed into two common factors and four specifics
with the loadings (see Chapter V, page 79):

            Common Factors   |          Specific Factors
    Test      I        II    |
    1       .5164      .     |  .8563     .       .       .
    2       .7746    .3162   |    .     .5477     .       .
    3       .7746    .3162   |    .       .     .5477     .
    4       .3873      .     |    .       .       .     .9220

Any one column of these loadings can be used as the
criterion row in the calculation by Aitken's method, and
the regression coefficients calculated with which to weight
a man's test scores in order to estimate that factor for
him. If, as is probable, we want to estimate both common
factors, we can do the two calculations together, as shown
below. Both arrays of loadings are written
below the matrix of intercorrelations, and then pivotal
condensation automatically gives both sets of regression
coefficients, with only one extra row in each slab of the
calculation:

    (1.0)     .4      .4      .2   |  -1.0      .       .       .
      .4     1.0      .7      .3   |    .     -1.0      .       .
      .4      .7     1.0      .3   |    .       .     -1.0      .
      .2      .3      .3     1.0   |    .       .       .     -1.0
     .5164   .7746   .7746   .3873 |    .       .       .       .
      .      .3162   .3162    .    |    .       .       .       .

    (.84)     .54     .22   |   .40    -1.0      .       .     |  1.0
    1.00     .6429   .2619  |  .4762  -1.1905    .       .     | 1.1905
     .54      .84     .22   |   .40      .     -1.0      .     |  1.0
     .22      .22     .96   |   .20      .       .     -1.0    |   .6
     .5680   .5680   .2840  |  .5164     .       .       .     | 1.9365
     .3162   .3162    .     |    .       .       .       .     |  .6324

    (.4928)   .0786  |  .1429    .6429   -1.0       .      |  .3571
    1.0000    .1595  |  .2899   1.3045   -2.0292    .      |  .7246
     .0786    .9024  |  .0952    .2619     .      -1.0000  |  .3381
     .2028    .1352  |  .2459    .6762     .        .      | 1.2603
     .1129   -.0828  | -.1506    .3764     .        .      |  .2560

    (.8899)  |   .0724    .1594    .1594   -1.0000
    1.0000   |   .0814    .1791    .1791   -1.1237
     .1029   |   .1871    .4116    .4116     .
    -.1008   |  -.1833    .2291    .2291     .

    Regression     I     .1787    .3932    .3932    .1156
    Coefficients   II   -.1751    .2472    .2472   -.1133  |  .2060

If, therefore, we have a man's scores (in standard
measure) in these four tests, our estimate of his Factor I
will be:

$$.1787z_1 + .3932z_2 + .3932z_3 + .1156z_4$$

and estimates made in this way will have a multiple
correlation $r_m$ with the "true" values of the factor, in a
number of different candidates, given by:

$$r_m^2 = .1787 \times .5164 + .3932 \times .7746 + .3932 \times .7746 + .1156 \times .3873 = .7462$$

$$r_m = .864$$

Similarly, the multiple correlation of the estimate of the
second factor with the "true" values can be found to be:

$$r_m = .395$$

The two factors are not, therefore, estimated with equal
accuracy by the team. As before, the whole calculation
can be checked by a pooling square.
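
Both sets of regression coefficients can also be found by solving the same equations twice, once for each column of loadings. The following is a sketch only: `solve` is an ordinary Gauss-Jordan helper written for this illustration rather than Aitken's pivotal condensation, and the dictionary keys "I" and "II" are simply labels for the two common factors.

```python
import math


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * pv for v, pv in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


R = [
    [1.0, 0.4, 0.4, 0.2],
    [0.4, 1.0, 0.7, 0.3],
    [0.4, 0.7, 1.0, 0.3],
    [0.2, 0.3, 0.3, 1.0],
]
loadings = {
    "I": [0.5164, 0.7746, 0.7746, 0.3873],
    "II": [0.0, 0.3162, 0.3162, 0.0],
}

# One set of regression weights per common factor.
weights = {name: solve(R, column) for name, column in loadings.items()}

# Multiple correlation of each estimate with the "true" factor.
multiple_r = {
    name: math.sqrt(sum(w * a for w, a in zip(weights[name], column)))
    for name, column in loadings.items()
}
```

The two entries of `multiple_r` show directly that the first factor is estimated much more accurately than the second.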

We have now found the regression equations for
estimating the two common factors by treating each in turn
as a "criterion." It is also possible to estimate a man's
specific factors in the same way. Indeed, we might have
written the loadings of the specific factors as four more
rows below the common-factor loadings in the first slab
and calculated their regression coefficients all in the one
calculation. But it is easier to obtain the estimate of a
man's specific by subtraction (compare Abilities, 1932
edition, page xviii, line 10). For example, we know that
the second test score is made up as follows:

$$z_2 = .7746f_1 + .3162f_2 + .5477s_2$$

where $f_1$ and $f_2$ are the man's common factors and $s_2$ his
specific. We have estimated his $f_1$ and $f_2$ and we know
his $z_2$; so we can estimate his $s_2$ from this equation. The
estimates of all a man's factors, to be consistent with the
experimental data, must satisfy this equation and similar
equations for the other tests. If the estimate of the
specific is actually made by a regression equation, just like
the other factors, it will be found to satisfy this
requirement.* From the estimates of all a man's factors,
therefore, including any specifics, we can reconstruct his scores
in the tests exactly. From only a few factors, however,
even from all the common factors, we cannot reproduce
the scores exactly, but only approximately.
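
The subtraction just described can be sketched numerically. The loadings and regression weights are those of the four-test example of this section; the set of scores (+1 in every test) is an assumption chosen only to keep the arithmetic simple, and the function names are illustrative.

```python
# Common-factor loadings (I, II) and specific loadings of the four tests.
common = [(0.5164, 0.0), (0.7746, 0.3162), (0.7746, 0.3162), (0.3873, 0.0)]
specific = [0.8563, 0.5477, 0.5477, 0.9220]

# Regression weights for the two common factors, as found in this section.
w_f1 = [0.1787, 0.3932, 0.3932, 0.1156]
w_f2 = [-0.1751, 0.2472, 0.2472, -0.1133]


def estimate_all_factors(z):
    """Estimate both common factors by regression, the specifics by subtraction."""
    f1 = sum(w * s for w, s in zip(w_f1, z))
    f2 = sum(w * s for w, s in zip(w_f2, z))
    specifics = [
        (score - a1 * f1 - a2 * f2) / u
        for score, (a1, a2), u in zip(z, common, specific)
    ]
    return f1, f2, specifics


def reconstruct_scores(f1, f2, specifics):
    """Rebuild the test scores from the full set of estimated factors."""
    return [
        a1 * f1 + a2 * f2 + u * s
        for (a1, a2), u, s in zip(common, specific, specifics)
    ]


z = [1.0, 1.0, 1.0, 1.0]  # hypothetical scores, in standard measure
f1, f2, s_hat = estimate_all_factors(z)
z_back = reconstruct_scores(f1, f2, s_hat)
```

With all the estimated factors, `z_back` returns the original scores; with the common factors alone (setting the specifics to zero) it would not.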

3. An arithmetical short cut (Ledermann, 1938a, 1939D).— 
If the number of tests is appreciably greater than the 
number of common factors, the following scheme for 
at we know the best relative loadings 


of the tests to estimate a specific by regression without needing to 
know how many common factors there are, or whether indeed any 
Specific exists or not. (Wilson, 1934. For the same fact in more 


familiar notation, see Thomson, 19364, 43.) 


* Tt is interesting to note th 
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arithmetic of the example easy. The regression estimates 
of his two common factors are— 
fi= 1787% + -3982z, + -3932z, + -11562, 
fo — 17512 + 94792, + -24722, — -113324 
Inserting his scores z = z, = 2, = z, = 1 into these 
equations we get for the regression estimates of his factors— 


f, = 1-0807 
fo = -2060 


that is, we estimate his first factor to be rather more than 
one standard deviation, his second factor to be about 
one-fifth of a standard deviation, above the average. 

Now, the specification equations which give the composi- 
tion of the four tests in terms of the factors are— 


z = -5164f, - + 8563s, 

% = “TTAGf, + -8169f, + 54775, 

% = ‘TTAG/, + -3162f, + -54778, 

z4 = -8878f, . + -9220s, 

Tf we insert the above estimates f, and di 
fy we get for this man’s scores— 


% = ‘5581 + -85634, 

Za = +9022 + -54778, 

2. :9022 + SATIS, 

% = 4180 + -92206, 
; We know his four Scores cach to have been + 1, and 
if we had also worked out the estimates of his specifics 
by the regression method we should have found that they 


added just enough to the above equations to make each 
indeed come to + 1. 


We can, therefore, find his estimated 
Specifics more easily fi 


I 


in lieu of f, and 


ll | 


E 
ll 


rom the above equations, as in this 
case— 
ee eM 
:8563 
$, = 1 — -9022 = -1786 
-5477 


and so for $, and $5 subtractin 


common factors from the know: 
case) and dividing by 


g the contribution of the 


wn score (here + 1 in each 
the specific loading. 
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The regression estimates of the factors, made by the 
system we have so far been considering, are as a matter 
of fact not the only estimates which have been proposed 
(see Section 8 later). The regression estimates are the 
best in the sense that they give the highest correlation, 
taken over a large number of men, between the estimates 
and the true values of a criterion when the latter can be 
Separately ascertained. 

The regression estimates of the factors have one other 
great advantage, that they are consistent with the ordinary 
estimation of vocational ability made without using factors 
at all, as can best be shown by means of the example of 
Section 7 of Chapter XIV. 

5. Vocational advice with and without factors.—In that 
example we had an "occupation" zo and our gti 
715 25, 23, and z4 ;.and in Chapter XIV, without using factors 
at all, we arrived at the following estimation of à man's 
success or “ score ” in the occupation (which is, after all, 


only a test like the others, though a long-drawn-out one)— 
By = -8902, + -4812, + 2222, + “018% 

Now let us suppose that the matrix of correlations of 
these five tests (including the occupation as à EE 
been analysed, by Thurstone’s method or any Bore 
common factors and specifics —the matrix is given 1n 
Chapter XIV, page 204. Indeed, the four tests proper were 
80 analysed by Dr. Alexander in the monograph from EST 
We took their correlations, and the analysis below is based 
On his. The * occupation " % is a pure fiction made o 
the purpose of this illustration, but we can easily imagine it 
also being analysed in exactly the same way as & test. 
The table of loadings of the factors, to which we may as 
Well give Dr. Alexanders names of £ (Spearman’s g), v (a 
verbal factor), and F (a practical factor), is as follows : 


z b p Specific 

SERM z -37 
Occupation Zo 3 ae E *50 
St T MET M Ds 

tanford Binet 9 71 60 
Picture completion zs BH z 54 
Reading test gate bes “BS S “67 
Geometrical analogies z4 T4 j i : 
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computing the regression coefficients will involve less 
arithmetical labour than the general formule expounded 
in Chapter XIV and applied to the factor problem in this 
chapter.* : 

For illustration, we shall use the data of the preceding 
section (page 225), although in that example the number 
of tests (four) exceeds the number of common factors (two) 
only by two, which is too small an amount to demonstrate 
fully the advantages of the present method. The common- 
factor loadings and the specifies of the four tests form a 
4 X 2 matrix and a 4 x 4 matrix respectively, thus : 


the matrix M, being identical with the first two columns, 
and the matrix M, with the last four columns of the table 
on page 225. Before the data are subjected to the com- 
putational routine process, which will again consist in the 
pivotal condensation of a certain array of numbers, some 
preliminary steps have to be taken: (i) the loadings of 
each test are divided by the square of its specific, and the 
modified values are then listed in a new 4 x 2 matrix : 


| 7042 | 
M, —| 25820 10540 | 
AT 95820 10540 
| 4556 V 
eg. 2-5820 = (-7746) — (-5477)2 


1:0540 — (3162) — (5477): 
(ii) Next, the inner 
every column of M 
calculated and arr. 


products (see footnote on page 74) of 
o in turn with every column of M, are 
anged in a 2 x 2 matrix: 


» in the form here 


For oblique fact i i in 
ER dn que factors, which are described i 


> 


: 5 are necessary in Ledermann’s formulz, 
for which see Thomson (19 


Mathematical Appendix, page 365. 
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paro. [4940 16829 
MS Jos E 2] 

If there had been r common factors the matrix J would 
have been an r x r matrix. The arithmetic is simplified 
by the fact that J is always symmetrical about its diagonal, 
so that only the entries on and above (below) the diagonal 
need be calculated. (iii) Finally, each element on the 
diagonal of J is augmented by unity, giving, in the notation 
of matrix calculus, the matrix : 


I+J= 1.6399 1666 


| 5-5401 — 1:6829 | 
5| 
I 


This matrix is now “ bordered ” below by the matrix 
M, and on the right-hand side by a block of minus ones 
and zeros in the usual way. The process of pivotal 
condensation then yields the same regression coefficients 
às were obtained on page 226. 


5 5-5401 — 1.6829 | —1-0000 : 61730 
1:0000 -2947 — 1805 1:1142 
1:6329 1:6665 . —1-0000 2:2994 
"7042 -7042 
2-5820 1:0540 3:6360 
2-5820 — 1-0540 3:6360 

1556 4556 
1-1853 .2947 —1-0000 -4800 
1:0000 -2486 —8437 -4050 
— 2075 4271 — 0804 

-2931 «4661 | 7591 

2931 -4661 “7591 

— 41848 -0822 —-0520 

l 4787 —4751 -0036 

Regression Coefficients 13932-2478 -6404. 
3932 — -2478 «6404 

1156 —-1188 -0028 


4. Reproducing the original scores.—Let us imagine a 
in our example obtains 


man who in each of the four tests 1 
à score of +1; that is, one standard deviation above the 
average. We choose this set of scores merely to make the 
With this table of loadings in our possession we might
have given vocational advice to a man in a roundabout
way. Instead of inserting his scores $z_1$, $z_2$, $z_3$, and $z_4$ in
the equation for $\hat{z}_0$, we might have estimated his factors
g, v, and F from his scores in the four tests, and then
inserted these estimated factors in the specification
equation of the occupation:

$$z_0 = .55g + .45v + .60F + .37s_0$$

(ignoring the specific $s_0$, which cannot be estimated from
$z_1$, $z_2$, $z_3$, and $z_4$). Had we done so, we should have arrived
at exactly the same numerical estimate of his $z_0$ as by the
direct method (Thomson, 1936a, 49 and 50).

The actual estimation of the factors g, v, and F from
the four tests will form a good arithmetical exercise for the
student. The beginning and end of the calculation of the
regression coefficients is shown here, following exactly
the lines of the smaller example on page 226 of this chapter:

                                                      Check
   1.00    .39    .69    .49  |  -1    .    .    .  |   1.57
    .39   1.00    .19    .27  |   .   -1    .    .  |    .85
    .69    .19   1.00    .38  |   .    .   -1    .  |   1.26
    .49    .27    .38   1.00  |   .    .    .   -1  |   1.14
    .66    .37    .52    .74  |   .    .    .    .  |   2.29
    .52     .     .66     .   |   .    .    .    .  |   1.18
    .21    .71     .      .   |   .    .    .    .  |    .92


This reduces by pivotal condensation step by step to the
three sets of regression coefficients:

   for $\hat{g}$     .300    .095    .095    .532
   for $\hat{v}$     .353   -.153    .581   -.352
   for $\hat{F}$     .121    .747   -.148   -.206
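The three sets of coefficients can be checked without following the condensation by hand, because each set is the solution w of R w = m, where R is the table of test correlations and m holds the loadings of one factor on the four tests. What follows is a sketch only: `solve` is an ordinary Gauss-Jordan routine of our own, not the book's layout, and the results should agree with the printed figures only to about two decimal places because the correlations are themselves rounded.

```python
# Sketch: regression weights for each factor as the solution of R w = m.
# R and the loadings are read from the bordered table; names are hypothetical.

R = [[1.00, 0.39, 0.69, 0.49],
     [0.39, 1.00, 0.19, 0.27],
     [0.69, 0.19, 1.00, 0.38],
     [0.49, 0.27, 0.38, 1.00]]

LOADINGS = {"g": [0.66, 0.37, 0.52, 0.74],
            "v": [0.52, 0.00, 0.66, 0.00],
            "F": [0.21, 0.71, 0.00, 0.00]}


def solve(matrix, rhs):
    """Solve matrix @ w = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Bring the largest remaining entry of this column up as the pivot.
        pivot_row = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


weights = {name: solve(R, m) for name, m in LOADINGS.items()}
for name, w in weights.items():
    print(name, [round(x, 3) for x in w])
```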


The result is to give us three equations for estimating
g, v, and F from a man's scores in the four tests, viz.:

$$\hat{g} = .300z_1 + .095z_2 + .095z_3 + .532z_4$$
$$\hat{v} = .353z_1 - .153z_2 + .581z_3 - .352z_4$$
$$\hat{F} = .121z_1 + .747z_2 - .148z_3 - .206z_4$$

Now let us assume a set of scores $z_1$, $z_2$, $z_3$, $z_4$ for a man,
and see what the estimate of his occupational ability is by
the two methods, the one direct without using factors, the
other by way of factors. Suppose his four scores are:

   $z_1$     $z_2$     $z_3$     $z_4$
   .2      .6     -.4      .7

The estimates of his factors g, v, and F will therefore be:

$$\hat{g} = .300 \times .2 + .095 \times .6 + .095 \times (-.4) + .532 \times .7 = .451$$
$$\hat{v} = .353 \times .2 - .153 \times .6 + .581 \times (-.4) - .352 \times .7 = -.500$$
$$\hat{F} = .121 \times .2 + .747 \times .6 - .148 \times (-.4) - .206 \times .7 = .387$$

If now we insert these estimates of his factors into
the specification equation of the occupation, ignoring its
specific, we get for our estimate of his occupational success:

$$\hat{z}_0 = .55 \times .451 + .45 \times (-.500) + .60 \times .387 = .255$$

that is, we estimate that he will be about a quarter of a
standard deviation better than the average workman.
This by the indirect method using factors.

By the direct method, without using factors at all, we
simply insert his test scores into the equation:

$$\hat{z}_0 = .390z_1 + .431z_2 + .222z_3 + .018z_4$$

and obtain:

$$\hat{z}_0 = .390 \times .2 + .431 \times .6 + .222 \times (-.4) + .018 \times .7 = .260$$

exactly the same estimate as before, for the difference in
the third decimal place is entirely due to “rounding off”
during the calculations. The third decimal place of the
direct calculation is more likely to be correct, since it is
so much shorter.
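The agreement of the two routes can be sketched in a few lines. The coefficients, loadings, and scores below are those quoted in this section; the variable names are ours. The last step composes the occupation's loadings with the factor weights, which shows why the indirect route can only reproduce the direct weights up to rounding.

```python
# Sketch: indirect (via estimated factors) versus direct estimate of z0.

FACTOR_WEIGHTS = {
    "g": [0.300, 0.095, 0.095, 0.532],
    "v": [0.353, -0.153, 0.581, -0.352],
    "F": [0.121, 0.747, -0.148, -0.206],
}
OCCUPATION_LOADINGS = {"g": 0.55, "v": 0.45, "F": 0.60}
DIRECT_WEIGHTS = [0.390, 0.431, 0.222, 0.018]
scores = [0.2, 0.6, -0.4, 0.7]


def dot(u, v):
    """Inner product of two equally long lists."""
    return sum(x * y for x, y in zip(u, v))


# Indirect route: estimate the factors, then apply the specification equation.
factor_estimates = {name: dot(w, scores) for name, w in FACTOR_WEIGHTS.items()}
indirect = sum(OCCUPATION_LOADINGS[n] * factor_estimates[n] for n in FACTOR_WEIGHTS)

# Direct route: apply the regression weights to the test scores.
direct = dot(DIRECT_WEIGHTS, scores)

# Composing the two steps gives weights on the tests themselves.
composed = [sum(OCCUPATION_LOADINGS[n] * FACTOR_WEIGHTS[n][i] for n in FACTOR_WEIGHTS)
            for i in range(4)]

print(round(indirect, 3), round(direct, 3))
print([round(w, 3) for w in composed])
```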
6. Why, then, use factors at all? The reader may now
ask, “What, then, is the use of estimating a man's factors
at all?” Well, in a case analogous to that of the present
example it is quite unnecessary to use factors at all, and
there is no doubt that a great many experimenters have
rushed to factorial analysis with quite unjustifiable hopes
of somehow getting more out of it than ordinary methods
of vocational and educational advice can give without
mentioning factors. But we must not go to the other
extreme and “throw out the baby with the bath-water.”
There may be other reasons for using factors, apart from
vocational advice. And even in giving such advice, which

really means describing men and occupations in similar
terms, so that we can see if they fit one another or not, it
may be that factors have some advantages not disclosed
by the above calculation.

This man whom we have already used, for example, may
be described either in terms of his scores in four fairly
well-known tests, or in terms of the factors g, v, and F.
By the former method his description is:

   Stanford-Binet test                  .2, slightly above average
   Picture-completion test              .6, good
   Thorndike reading test              -.4, distinctly below average
   Spearman's geometrical analogies     .7, good

[...] of not much schooling, [...] shapes, and similarities in
[...] of the occupation with these [...] most resembles the
first and last [...] resembles the third. We can probably
draw the conclusion that this man will be above average
in it; and we can draw this conclusion accurately if we
calculate the regression equation:

$$\hat{z}_0 = .390z_1 + .431z_2 + .222z_3 + .018z_4$$

As a description of the man, however, the above table
suffers from the fact that the four tests are correlated with
one another. We feel a certain clarity in the description
[...] are independent of one another. The man whom we
are at present considering is alternatively described, in
terms of factors, as:

   Factor     Estimated Amount
     g             .451
     v            -.500
     F             .387

that is, a quite intelligent (g) and practical (F) man with,
however, not much ability in using and understanding
words (v). There is a certain air of greater generality
about the factors than there is about the particular tests
from which they have been deduced, and they give
definition and point to mental descriptions, or at least they
seem to do so.

Yet some of these “advantages” of using factors begin
to look less bright when looked into more carefully. We
said that one advantage is that factors are independent
and uncorrelated. So they are, if their true values are
known. But we only know their estimates, and these are
correlated, as we shall illustrate shortly. If we use factors
it is clear that we must, if we value the advantage of
independence, seek to obtain estimates which are as little
correlated with one another as possible. There have been
proposals to use factors which are really correlated; not
merely correlated when their estimates are taken, but
correlated in their true measures. What advantage can
these have over the actual correlated tests? The fundamental
advantage hoped for by the factorist seems to be
that the factors (correlated or uncorrelated) may turn out
to be comparatively few in number, and may thus replace
a multitude of tests and innumerable occupations by a
description in these few factors. The student whose
knowledge of the subject is being obtained from this book
is not yet equipped to discuss adequately the very fundamental
questions raised in this section, to which we shall
return several times in later chapters. One last point in
favour of factors may, however, be expanded somewhat
here. We said a couple of sentences back that factorists
hope to give adequate descriptions of men and of occupations
in terms of a comparatively small number of factors.
This, if achieved, would react on social problems somewhat
in the same way as the introduction of a coinage influences
trade previously carried on by barter. A man can exchange
directly five cows for so many sheep, so much
cloth, and a new ploughshare; but the transaction is
facilitated if each of these articles is priced in pounds,
shillings, and pence, or in dollars and cents, even though
the end-result is the same. And so perhaps with the
“pricing” of each man and each occupation in terms of a
few factors.

But the prices must be accurate; and the analyses of
correlations are easily calculated from the inner products
of (b), the loadings of the estimated factors with the tests
(page 232), with (a), the loadings of the tests with the
factors (page 231).

The matrix of loadings of the four tests with the three
common factors is (page 231):

$$M = \begin{bmatrix} .66 & .52 & .21 \\ .37 & \cdot & .71 \\ .52 & .66 & \cdot \\ .74 & \cdot & \cdot \end{bmatrix}$$

and the matrix of the loadings of the three estimated
factors with the four tests is (page 232):

$$N = \begin{bmatrix} .300 & .095 & .095 & .532 \\ .353 & -.153 & .581 & -.352 \\ .121 & .747 & -.148 & -.206 \end{bmatrix}$$

Then the matrix of variances and covariances of the
estimated factors is:

$$K = NM$$

Performing the matrix multiplications as explained in
Chapter X, Section 4, page 145, we obtain:

$$NM = \begin{bmatrix} .300 & .095 & .095 & .532 \\ .353 & -.153 & .581 & -.352 \\ .121 & .747 & -.148 & -.206 \end{bmatrix} \begin{bmatrix} .66 & .52 & .21 \\ .37 & \cdot & .71 \\ .52 & .66 & \cdot \\ .74 & \cdot & \cdot \end{bmatrix} = \begin{bmatrix} .676 & .219 & .130 \\ .218 & .567 & -.034 \\ .127 & -.035 & .556 \end{bmatrix} = K$$
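The product NM is easy to repeat mechanically. The sketch below uses the two matrices as printed (with the blank entries of M taken as zeros) and a small `matmul` helper of our own; it also reports the largest departure of K from symmetry, which serves as the arithmetical check.

```python
# Sketch: K = N M for the three estimated common factors.

N = [[0.300, 0.095, 0.095, 0.532],
     [0.353, -0.153, 0.581, -0.352],
     [0.121, 0.747, -0.148, -0.206]]

M = [[0.66, 0.52, 0.21],
     [0.37, 0.00, 0.71],
     [0.52, 0.66, 0.00],
     [0.74, 0.00, 0.00]]


def matmul(a, b):
    """Plain row-by-column product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


K = matmul(N, M)
asymmetry = max(abs(K[i][j] - K[j][i]) for i in range(3) for j in range(3))

for row in K:
    print([round(x, 3) for x in row])
print("largest asymmetry:", round(asymmetry, 4))
```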


If our arithmetic throughout the whole calculation of N
had been perfectly accurate, the matrix K would have been
exactly symmetrical about its diagonal. The actual
discrepancies (such as .127 and .130) are a measure
of the degree [...]

The matrix K gives by its diagonal elements the variances
of the estimated factors (that is, the squares of their standard
deviations), and by its other elements their covariances in
pairs (that is, their overlap with one another). The
correlation of any two estimated factors is equal to (see
Chapter I, Figure 2):

$$r_{ij} = \frac{\text{covariance}\,(ij)}{\sqrt{\text{variance}\,(i) \times \text{variance}\,(j)}}$$

From K we can therefore form the matrix of correlations
of the estimated factors. It is:


$$\begin{bmatrix} 1.000 & .353 & .212 \\ .353 & 1.000 & -.061 \\ .212 & -.061 & 1.000 \end{bmatrix}$$

wherein .353, for example, is $.219 \div \sqrt{.676 \times .567}$.
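The step from covariances to correlations can be sketched as follows. The matrix K is entered as printed; averaging each pair of off-diagonal entries before dividing is our own way of handling the small asymmetry and is an assumption, so the third decimal place may differ slightly from the printed correlations.

```python
# Sketch: correlations of the estimated factors from their covariance matrix.
import math

K = [[0.676, 0.219, 0.130],
     [0.218, 0.567, -0.034],
     [0.127, -0.035, 0.556]]


def correlations_from_covariances(k):
    """Divide each covariance by the product of the two standard deviations."""
    n = len(k)
    # Assumption: smooth out rounding asymmetry by averaging k[i][j] and k[j][i].
    sym = [[(k[i][j] + k[j][i]) / 2 for j in range(n)] for i in range(n)]
    sd = [math.sqrt(sym[i][i]) for i in range(n)]
    return [[sym[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]


correlations = correlations_from_covariances(K)
for row in correlations:
    print([round(x, 3) for x in row])
```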
Although the true factors g and v are uncorrelated,
their estimates $\hat{g}$ and $\hat{v}$ are correlated to an
amount .353. The “true” factors g, v, and F are in standard
measure, but their estimates $\hat{g}$, $\hat{v}$, and $\hat{F}$ have variances of
only .676, .567, and .556 instead of unity. These variances,
be it noted in passing, are equal also to the squares of the
correlations between g and $\hat{g}$, v and $\hat{v}$, F and $\hat{F}$.

Not only are the estimates of the common factors
correlated among themselves; they are correlated with
the specifics, so that the estimates of the specifics are not
strictly specific. As a numerical illustration we may take
the hierarchical matrix used in Section 1, pages 221 f.:


$$\begin{array}{c|cccc} & z_1 & z_2 & z_3 & z_4 \\ \hline z_1 & 1.00 & .72 & .63 & .54 \\ z_2 & .72 & 1.00 & .56 & .48 \\ z_3 & .63 & .56 & 1.00 & .42 \\ z_4 & .54 & .48 & .42 & 1.00 \end{array}$$

The regression estimate of g from this battery is (as we
found on page 223):

$$\hat{g} = .553z_1 + .259z_2 + .160z_3 + .109z_4$$

The regression estimates for the four specifics can also
be found, either by a full calculation like that of page
tests and occupations into factors, still more the calculation
of quantitative estimates of these factors, are as yet very
inaccurate, and perhaps are inherently subject to uncertainty.
A fluctuating and doubtful coinage can be a
positive hindrance to trade, and barter may be preferable
in such circumstances.

We showed in Section 5 above that a direct regression
estimate of a man's ability in an occupation gives identically
the same result as an estimate via the roundabout path of
factors, so that at least when the direct regression estimate
is possible there can be no quantitative advantage in using
factors. When, however, is the direct regression estimate
possible, and when is it impossible?

To make the direct regression estimate we require the
complete table of correlations of the tests with one another
and with the occupation, and we have to know the candidate's
scores in the tests. This implies that these same tests have
been given to a number of workers whose proficiency in the
occupation is known, for otherwise we would not know the
correlations of the tests with the occupation. Under these
ideal circumstances any talk of factors is certainly unnecessary
so far as obtaining a quantitative estimate is concerned.

But suppose these ideal conditions do not hold! These
tests which we have given to the candidate have never
been given, at any rate as a battery, to workers in the
occupation, and their correlations with it are
unknown! This situation [...] In vocational advice [...]
But in vocational [...] gauge the young person's ability in
[...], and it is unlikely that just this [...]
we are using has been given to workers [...]
jobs. In that case we cannot make [...] estimate of our
candidate's probable proficiency in every occupation. Can
we, then, obtain an estimate in any other way?
Other ways are conceivable, but it must at the outset
be emphasized that they are bound to be less accurate than 
the direct estimate without factors. Although this battery 
of tests has not been given to workers in the occupation, 
perhaps other tests have, and by the aid of that other 
battery a factor analysis of the occupation has perhaps 
been made. If our tests enable the same factors to be 
estimated, we can gauge the man’s factors and thence 
indirectly his occupational proficiency. Unfortunately, 
the “if” is a rather big one. Are factors obtained by 
the analysis of different batteries of tests the same factors ; 
may they not be different even though given the same 
name? We shall discuss this very important point later, 
but meanwhile let us suppose that we have reasonable 
confidence in the identity of factors called by the same 
name by different workers with different batteries. Then 
the probable course of events would be something like this. 
An experimenter, using whatever tests he thinks practicable
and suitable, analyses an occupation into factors. Another
experimenter, at a different time and place, is asked to
give advice to a candidate for that occupation. Using
whatever tests he in his turn has available, he assesses in
this candidate the factors which the previous experimenter's
work leads him to think are necessary in the occupation,
and gives his advice accordingly. The factors have played
their part as a go-between, like a coinage. All depends on
the confidence we have in the identity of the factors. We
shall see later that there is only too much reason to think
that the possibility of this confidence being misplaced has
hardly been sufficiently realized by many over-enthusiastic
factorists. And even if the common factors are identical,
there remains the danger that the “specific” of the occupation
may be correlated with some of the “specifics”
of the tests, a fact which cannot be known unless the same
tests have been given to workers in the occupation.

7. Calculation of correlation between estimates. We said
above that even although we make our analysis of the tests
we use into uncorrelated factors, the estimates of these
factors will be correlated, if we use communalities and thus
have more factors than tests. Arithmetically, these
226, or by the simpler method of subtraction of page
227. Thus, to estimate $s_1$ in our present example we
know that:

$$z_1 = .9g + \sqrt{1 - .9^2}\; s_1 = .9g + .436s_1$$

Also we know that the estimates $\hat{g}$ and $\hat{s}_1$ will satisfy the
same equation:

$$z_1 = .9\hat{g} + .436\hat{s}_1$$

that is:

$$\hat{s}_1 = \frac{z_1 - .9\hat{g}}{.436}$$

On inserting the expression for $\hat{g}$ into this we get:

$$\hat{s}_1 = 1.152z_1 - .535z_2 - .333z_3 - .225z_4$$

and similarly:

$$\hat{s}_2 = -.737z_1 + 1.313z_2 - .215z_3 - .145z_4$$
$$\hat{s}_3 = -.542z_1 - .253z_2 + 1.242z_3 - .106z_4$$
$$\hat{s}_4 = -.415z_1 - .194z_2 - .121z_3 + 1.169z_4$$
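The method of subtraction lends itself to a short sketch. It assumes the hierarchical loadings .9, .8, .7, .6 of this example and the regression weights for g quoted earlier; `specific_weights` is a hypothetical helper name. Since the printed figures were reached by hand, agreement in the third decimal place should not be expected for every coefficient.

```python
# Sketch: weights of an estimated specific, by subtraction from the test score.
import math

LOADINGS = [0.9, 0.8, 0.7, 0.6]            # g loadings of the four tests
G_WEIGHTS = [0.553, 0.259, 0.160, 0.109]   # regression weights for g


def specific_weights(test, loadings, g_weights):
    """Weights of s_hat for one test: (z_test - r * g_hat) / sqrt(1 - r^2)."""
    r = loadings[test]
    u = math.sqrt(1 - r * r)  # loading of the test on its own specific
    return [((1.0 if i == test else 0.0) - r * w) / u
            for i, w in enumerate(g_weights)]


for test in range(4):
    print(f"s{test + 1}:", [round(x, 3) for x in specific_weights(test, LOADINGS, G_WEIGHTS)])
```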

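We have now both N, the matrix of loadings of the
estimated factors $\hat{g}$, $\hat{s}_1$, $\hat{s}_2$, $\hat{s}_3$, and $\hat{s}_4$ with the four tests, and
M, which we already know, the matrix of loadings of the
four tests with the five factors g, $s_1$, $s_2$, $s_3$, and $s_4$, namely:

$$M = \begin{bmatrix} .9 & .436 & \cdot & \cdot & \cdot \\ .8 & \cdot & .600 & \cdot & \cdot \\ .7 & \cdot & \cdot & .714 & \cdot \\ .6 & \cdot & \cdot & \cdot & .800 \end{bmatrix}$$

From their product NM we obtain the matrix K of
variances and covariances of the estimated factors, namely:

$$NM = \begin{bmatrix} .553 & .259 & .161 & .109 \\ 1.152 & -.535 & -.333 & -.225 \\ -.737 & 1.313 & -.215 & -.145 \\ -.542 & -.253 & 1.242 & -.106 \\ -.415 & -.194 & -.121 & 1.169 \end{bmatrix} \begin{bmatrix} .9 & .436 & \cdot & \cdot & \cdot \\ .8 & \cdot & .600 & \cdot & \cdot \\ .7 & \cdot & \cdot & .714 & \cdot \\ .6 & \cdot & \cdot & \cdot & .800 \end{bmatrix}$$

$$= \begin{bmatrix} .880 & .241 & .155 & .115 & .087 \\ .241 & .502 & -.321 & -.238 & -.180 \\ .150 & -.321 & .788 & -.154 & -.116 \\ .116 & -.236 & -.152 & .887 & -.085 \\ .088 & -.181 & -.116 & -.086 & .935 \end{bmatrix} = K$$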
Again, we have a check on the accuracy of our arithmetic,
for K will, if we have been accurate, be exactly
symmetrical about its principal diagonal, i.e. its diagonal
running from north-west to south-east. The largest discrepancy
in our case is between .150 and .155. Moreover,
since in this case K includes all the factors, we have another
check which was not available when we calculated a K for
common factors only: the sum of the elements in the
principal diagonal (called the “trace,” or in German the
“Spur”) here must come out equal to the number of tests.
In our case we have:

$$.880 + .502 + .788 + .887 + .935 = 3.992$$

and there are four tests. These elements which form the
trace of K are, it will be remembered, the variances of the
estimates $\hat{g}$, $\hat{s}_1$, $\hat{s}_2$, $\hat{s}_3$, and $\hat{s}_4$, so that we see that the total
variance of the five factors is no greater than the total
variance (viz. 4) of the four tests in standard measure.
This is only another instance of the general law that we
cannot get more out of anything than we put into it (at
any rate, not in the long run).
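Both checks, symmetry and trace, can be run on the five-factor product in a few lines. This is a sketch using the matrices N and M of the hierarchical example with blank entries taken as zeros; the variable names are ours, and the results are expected to match the hand calculation only within rounding.

```python
# Sketch: K = N M for g and the four specifics, with the two arithmetical checks.

N = [[0.553, 0.259, 0.161, 0.109],
     [1.152, -0.535, -0.333, -0.225],
     [-0.737, 1.313, -0.215, -0.145],
     [-0.542, -0.253, 1.242, -0.106],
     [-0.415, -0.194, -0.121, 1.169]]

M = [[0.9, 0.436, 0.0, 0.0, 0.0],
     [0.8, 0.0, 0.600, 0.0, 0.0],
     [0.7, 0.0, 0.0, 0.714, 0.0],
     [0.6, 0.0, 0.0, 0.0, 0.800]]

# Row of N (one estimated factor) times column of M (one true factor).
K = [[sum(N[i][t] * M[t][j] for t in range(4)) for j in range(5)]
     for i in range(5)]

# Check 1: K should be symmetrical apart from rounding.
largest_asymmetry = max(abs(K[i][j] - K[j][i]) for i in range(5) for j in range(i))

# Check 2: the trace should equal the number of tests.
trace = sum(K[i][i] for i in range(5))

print("largest asymmetry:", round(largest_asymmetry, 4))
print("trace:", round(trace, 3))
```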
From K we can at once calculate the correlations of the
estimated factors. Adjusting the slight arithmetical departures
from symmetry, we get:

$$\begin{array}{c|rrrrr} & \hat{g} & \hat{s}_1 & \hat{s}_2 & \hat{s}_3 & \hat{s}_4 \\ \hline \hat{g} & 1.000 & .362 & .184 & .131 & .096 \\ \hat{s}_1 & .362 & 1.000 & -.510 & -.354 & -.263 \\ \hat{s}_2 & .184 & -.510 & 1.000 & -.183 & -.135 \\ \hat{s}_3 & .131 & -.354 & -.183 & 1.000 & -.094 \\ \hat{s}_4 & .096 & -.263 & -.135 & -.094 & 1.000 \end{array}$$

from which we see that $\hat{g}$ is correlated with each of the
estimated specifics positively, while the latter are correlated
negatively among themselves, in this (a hierarchical)
example.

We have then this result, that although we set out to
analyse our battery of tests into independent uncorrelated
factors, the estimates which we make of these factors are
correlated with one another, and instead of being in
3. Factors possessed by each person and by each test.
Burt then goes on to “estimate,” by “[...]tions,”
the amount of the factors $\gamma$ possessed by the
persons, and the amount of the factors f possessed by the
tests. There is a misuse of terms here, for with these
factors there is no need to “estimate”; they can be
accurately calculated: but that is a [...] point. The
three equations can be solved for the $\gamma$'s (there is indeed
one equation too many, but it is consistent). And the four
equations of the second group can be solved for the f's;
again they are consistent. Since the equations are consistent,
we can choose the easiest pair in each case to solve
for the two unknowns. Choosing the two equations for
$x_1$ and $x_2$ we obtain:

$$\gamma_1 = \frac{x_1}{2\sqrt{14}}, \qquad \gamma_2 = \frac{x_1 + 2x_2}{2\sqrt{6}}$$

For the other set of factors we naturally choose the
equations in a and c, and have:

[equations for $f_1$ and $f_2$ in terms of a and c; illegible in the source]

Now, since we are [...] discussion, let us rem[...]
these factors [...] f are. The factors $\gamma$ are factors into which
each test has been an[...]
from test to test, bu[...]
them. They vary in amount from person to person.

The factors f are factors into which each person has been
analysed. These do [...]. Each person is differently
loaded with them, that is, made up of them in different
proportions. The $\gamma$'s are uncorrelated fictitious tests; the
f's are uncorrelated fictitious persons.



Now, from the equations:

$$\gamma_1 = \frac{x_1}{2\sqrt{14}}, \qquad \gamma_2 = \frac{x_1 + 2x_2}{2\sqrt{6}}$$

we can find the amount of each factor $\gamma_1$ and $\gamma_2$ possessed
by each person, by inserting his scores $x_1$ and $x_2$ in these
equations, scores which are given in the matrix:

            a     b     c     d
      1    -6     2     0     4
      2     3     1    -1    -3
      3     3    -3     1    -1

Thus the first person possesses $\gamma_1$ in an amount
$-6/(2\sqrt{14})$, because his $x_1$ is -6. For the four persons
and the two factors we find the amounts of these factors
possessed by each person to be:

   Factors     $\gamma_1$              $\gamma_2$
      a      $-3/\sqrt{14}$           0
      b      $1/\sqrt{14}$            $2/\sqrt{6}$
      c      0                $-1/\sqrt{6}$
      d      $2/\sqrt{14}$            $-1/\sqrt{6}$
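The amounts in the table can be recomputed from the score matrix. The sketch below assumes the two equations in the form $\gamma_1 = x_1/(2\sqrt{14})$ and $\gamma_2 = (x_1 + 2x_2)/(2\sqrt{6})$, which is our reading of a damaged passage, and uses hypothetical names throughout. It also confirms that each factor has unit sum of squares over the four persons and that the two factors are uncorrelated.

```python
# Sketch: amounts of the two test factors possessed by each of the four persons.
import math

# Scores of persons a-d in tests 1, 2, 3 (columns of the score matrix).
SCORES = {"a": (-6, 3, 3), "b": (2, 1, -3), "c": (0, -1, 1), "d": (4, -3, -1)}


def gamma_amounts(x1, x2):
    """Assumed form of the two equations chosen for tests 1 and 2."""
    return (x1 / (2 * math.sqrt(14)), (x1 + 2 * x2) / (2 * math.sqrt(6)))


amounts = {person: gamma_amounts(x[0], x[1]) for person, x in SCORES.items()}

sum_sq_1 = sum(g1 * g1 for g1, _ in amounts.values())
sum_sq_2 = sum(g2 * g2 for _, g2 in amounts.values())
cross = sum(g1 * g2 for g1, g2 in amounts.values())

for person, (g1, g2) in amounts.items():
    print(person, round(g1, 4), round(g2, 4))
print(round(sum_sq_1, 6), round(sum_sq_2, 6), round(cross, 6))
```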


4. Reciprocity of loadings and factors. These are the
amounts of the factors $\gamma$ possessed by the four persons. If
now the reader will compare them with the loadings of
the factors f in the second set of equations on page 265,
he will see a resemblance. The signs are the same, and
the zeros are in the same places. Moreover, the resemblance
becomes identity if we destandardize the factors $f_1$ and $f_2$,
measuring the former in units $\sqrt{84}$ times as large, and the
latter in units $\sqrt{12}$ times as large, 84 and 12 being the
non-zero latent roots of both matrices. In these units let us
standard measure have variances, and therefore standard
deviations, less than unity. We could, of course, make
them unity by dividing all our estimates by their calculated
standard deviation. But that would make no change in
their correlations.

The cause of all this is the excess of factors over tests,
and consequently this drawback (the correlation of the
estimates) depends upon the ratio of the number of factors
to the number of tests. The extra factors are the common
factors, for there is a specific to each test, and therefore
with the same number of common factors the correlation
between the estimates will decrease as the number of tests
in the battery increases. Just as in the hierarchical case
one of the tasks of the experimenter is to find tests to add
to the number in his battery without destroying its hierarchical
nature, so in the case of a battery which can be
reduced to rank 2, 3, 4 ... or r, a task will be to add
tests to the battery which with suitable communalities will
leave the rank unchanged and the pre-existing communalities
unaltered, in order that the common factors
may be the more accurately estimated, and the estimates
be more nearly uncorrelated.
8. Bartlett's method [...] regression method used above, but by
[...]imizes the sum of the squares of a [...]
already, however, maximized by [...]
few common factors as possible).

[...] Bartlett's estimates differ from [...]
regression estimates of f. [...]
geometrical picture already used [...]
When the factors outnumber the tests,
the vectors represen[...] in a space of higher
[...] in the complete
[...] representative point Q may
be, for all we know, a[...]
perpendicular to the test space and intersects with it at
P. In these circumstances the regression method takes
refuge in the assumption that this individual is average
in all qualities of which we know nothing; that is, in
all qualities orthogonal to our test space. It therefore
assumes P to be his point also in the factor space, and
projects P on to the factor axes to get the estimates of his
factors.

Bartlett's method is equivalent to a different assumption
about the position of the point Q. Within the complete
factor space there is a subspace which contains the common
factors. Of all the positions open to the point Q, Bartlett's
method chooses that one which is nearest to the common-factor
space, and from thence projects on to the common-factor
vectors. This is equivalent to making the assumption
that this man is not average in the qualities about which
we know nothing, but instead possesses in those unknown
qualities just those degrees of excellence which bring his
representative point to the chosen point Q. Because men
are most frequently near the average, the regression
assumption is more likely.

9. The geometrical interpretation of Bartlett's method.
All this can be most clearly seen (because a perspective
diagram can be made) in the case of estimating one general
factor g only, the hierarchical case. A figure like Figure 30
will illustrate this case, if we take y and z there to be two
tests and x to be the g vector (see page 214).

The man's representative point in the yz plane is P.
But we do not know his representative point Q in solid
three-dimensional space, only that it is somewhere on the
line P'PP". The regression method assumes that it is
actually at P, the average, and projects P itself on to the g
line to get the estimate OL of g. Bartlett's method, on the
other hand, assumes that Q is at that point on P'PP" where
it most nearly approaches the g line, that is, somewhere
near the position Q in our diagram. Bartlett's estimate of
g is then represented by OL'.

Now, any point on the line P'PP", when projected on to
the test vectors y and z, gives the same two test scores.
There is, in general, no point on the line g which does this
exactly. But clearly [...] point whose projections most
nearly [...] L' is as near as possible to the line P'PP". That
is, the projection of L' on to the plane of the tests falls as
near to the point P as is possible. In other words, if we drop
the specifics entirely and use only the estimated g in the
specification of y and z, Bartlett's estimate comes as near
as is possible to giving us back the full scores OM and O[...].
If the regression estimate OL is projected on to the lines
y and z, it will obviously give a worse approximation.

The regression method, in order to recover as much as
possible of the original scores, would have to make a
second estimate of them. For the estimates of g represented
by quantities like OL are not in standard measure.
Before projecting the point L on to the lines y and z,
therefore, to recover the original scores as far as possible,
the regression method would alter the scale of its space
along the g vector until the quantities like OL were in
[...] This would not only change the pos[...];
it would change the angles which
the lines in the figure make with one another; and would
[...] such a manner that, in the new space,
[...] on to y and z would fall exactly where
[...] from L' fall in the present space
(Thomson, 1938a).


There is, therefore, no final difference in excellence
between the two methods in the matter of restoring the
original scores as fully as possible, but the regression
method takes two bites at the cherry. On the other hand,
the regression estimates can be put straight into the
specification equation of an occupation which is known to [...]

[...] their estimate of g when a new test is ad[...]
For the man is not very likely to have [...] this new test,
either the average val[...] by the regression
method, or the speci[...] assumed by the Bartlett
method. But he [...] the latter, so th[...]
than do the regression estimates as the battery grows.
Ultimately, when the number of tests becomes infinite, the
two forms of estimate will agree.

In the case of estimates of one general factor g from a
hierarchical battery, the Bartlett estimates differ from the
regression estimates only in scale. They put the candidates
in the same order of merit for g as do the regression
estimates, but give them a greater scatter, making the high
g's higher and the low g's lower. The formula is:

$$\hat{g} = \frac{1}{S} \sum_i \frac{r_{ig}}{1 - r_{ig}^2}\, z_i$$

instead of Spearman's:

$$\hat{g} = \frac{1}{1 + S} \sum_i \frac{r_{ig}}{1 - r_{ig}^2}\, z_i \qquad \text{(page 224)}$$

where $S = \sum_i r_{ig}^2 / (1 - r_{ig}^2)$.

With more than one common factor, the connexion
between the two kinds of estimate is not so simple (Appendix,
Section 13). The mathematical reader will be able to
calculate the Bartlett factor estimates from the matrix
formulae given in the Appendix.

10. Estimation of oblique factors. In applying the
method of Section 2 to oblique factors, it is important to
note that we must use, below the matrix of correlations of
the tests, in a calculation like that on page 226, the matrix
of correlations of the primary factors with the tests.
These are the elements of the structure on the primary
factors, $F(\Lambda')^{-1}D$, transposed so that columns become rows
and vice versa. It would not do to use the structure on the
reference vectors, which is all that most experimenters
content themselves with calculating.

Ledermann's short cut (Section 3 above) requires
considerable modification in the case of oblique factors. See
Thomson (1949) and the later part of Section 19 of the
Mathematical Appendix, page 365.

PART V
CORRELATIONS BETWEEN PERSONS


CHAPTER XVI

REVERSING THE ROLES*

* The first explicit references to correlations between persons in
connexion with factor technique seem to have been made independently
and almost simultaneously by Thomson (1935b, July) and
Stephenson (1935, August), the former being pessimistic, the latter
optimistic. But such correlations had actually been used much
earlier by Burt and by Thomson, and almost certainly by others.
See Burt and Davies, Journ. Exper. Pedag., 1912, 1, 251.


1. Exchanging the rôles of persons and tests. In all the
previous chapters the correlations considered have been
correlations between tests, and the experiments envisaged
were experiments in which comparatively few tests were
administered to a large number of persons. For each test
there would, therefore, be a long list of marks. The whole
set of marks would make an oblong matrix, with a few
rows for the tests, and a very large number of columns for
the persons (we will choose that way of writing it, of the
two possibilities).

From such a set of marks we then calculated the
correlation coefficients for each pair of tests, and our
analysis of the tests into factors was based upon these.
In the process of calculating a correlation coefficient we do
such things to the row of marks in each test as finding its
average, and finding its standard deviation. We quite
naturally assume that we can legitimately carry out these
operations. We assume, that is, that in the row of marks
for one test these marks are comparable magnitudes which
at any rate rise and fall with some mental quality even
if they do not strictly speaking measure it in units, like
feet or ounces.

The question we are going to ask in this part of this
book is whether, in the above procedure, the rôles of persons
and of tests can be exchanged (Thomson, 1935b, 75,
Equation 17), and if so what light this throws upon
factorial analysis. Instead of comparatively few tests
(perhaps two or three dozen) and a very large number of
persons, suppose we have comparatively few persons, and
a large number of tests, and find the correlations between
the persons. In that case our matrix of marks would be
oblong in the other direction, with a large number of
rows for the tests, and a small number of columns for
the persons, and each correlation, instead of being as
before between two rows, would be between two columns.
Taking only small numbers for purposes of an explanatory
table, we would have in the ordinary kind of correlations
a table of marks like this:

                       Persons
            x   x   x   x   x   x   x   x
   Tests    x   x   x   x   x   x   x   x
            x   x   x   x   x   x   x   x

while for correlations between persons we would have a
table of marks like this:

              Persons
            x   x   x
            x   x   x
            x   x   x
   Tests    x   x   x
            x   x   x
            x   x   x
            x   x   x

[...] scoring system which is whol[...]
(Thomson, 1935b, 75-6).
To make this difficulty more obvious, let us suppose
that the first four tests are:
1. A form-board test;
2. A dotting test;
3. An absurdities test;
4. An analogies test.

In each of these the experimenter has devised some
kind of scoring system. Perhaps in the form-board test
he gives a maximum of 20 points, and in the dotting test
the score may be the number of dots made in half a minute.
But to find the average of such different things as this is
palpably absurd, and the whole operation can be entirely
altered by an arbitrary change like taking the number of
seconds to solve the form board instead of giving points.
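The effect of such an arbitrary change of scoring can be seen in a small sketch. All the marks below are invented for illustration, as are the names; the only point is that the correlation between two persons, taken down their columns of marks, shifts when one test is rescored from points to seconds, although nothing about the persons has changed.

```python
# Sketch: a person-by-person correlation depends on how each test is scored.
import math


def pearson(x, y):
    """Product-moment correlation of two equally long lists of marks."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical marks in: form board, dotting, absurdities, analogies.
person_a_points = [18, 30, 12, 10]
person_b_points = [6, 34, 14, 8]

# Same performances, but the form board is now scored in seconds taken.
person_a_seconds = [30, 30, 12, 10]
person_b_seconds = [120, 34, 14, 8]

r_points = pearson(person_a_points, person_b_points)
r_seconds = pearson(person_a_seconds, person_b_seconds)
print(round(r_points, 3), round(r_seconds, 3))
```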
2. Ranking pictures, essays, or moods.—This is a very 
fundamental difficulty which will probably make correla- 
tions between persons in the general case impossible to 
calculate. In certain situations, however, it does not arise, 
namely where cach person can put the “tests” in an 
order of preference according to some criterion or judg- 
ment (Stephenson, 1935), and it is with cases of this kind 
that we shall deal in the first place. Usually the “ tests 5 
here are not really different tests like those named above, 
but are perhaps a number of children's essays which have 
to be placed in order of merit, or a number of pictures in 
order of «esthetic preference, or a number of moods which 
the subject has to number, indicating the frequency of 
their occurrence in himself. Indeed, the subject might not 
only give an order of preference to, say, the essays, but 
might give them actual marks, and there would be no 
absurdity in averaging the column of such marks, or in 
correlating two such columns, made by different persons. 
Such a correlation coefficient would show the degree of 
resemblance between the two lists of marks given to the 
children, or given to a set of pictures according to their
æsthetic value. It would indicate, therefore, a resemblance
between the minds of the two persons who marked the 
essays or judged the pictures. A matrix of correlations 
between several such persons might look exactly like the 
matrices of correlations between tests, and could be 
analysed in any of the same ways. What would the
“factors” which resulted from such an analysis mean when
the correlations were between persons? Take an imaginary
hierarchical case first.

3. The two sets of equations.—In test analysis the common
factor found was taken to be something called into play
by each test, the different tests being differently loaded
with it. The test was represented by an equation such
as—

    $z_4 = .6g + .8s_4$

For each of the numerous persons who formed the subjects
of the testing, an estimate was made of his g, and
another estimate could be made of his $s_4$. The different
tests were combined into a weighted battery for this
purpose of estimating a man’s amount of g. His score in
Test 4 would then be made up of his g and $s_4$ inserted in
the above specification equation.

    $z_{4.9} = .6g_9 + .8s_{4.9}$

would be the score of the ninth person in Test 4.

By analogy, when we analyse a matrix consisting of
correlations between persons, we arrive at a set of equations
describing the persons in terms of common and specific
factors. Corresponding to a hierarchical battery of tests,
we could conceivably have a hierarchical team of persons,
from which we would exclude any person too similar to
one already included. Each person in the hierarchical
team would then be made up of a factor he shared with
everyone else in the team, and a specific factor which was
his own idiosyncrasy. An equation like—

    $z_9 = .4g' + .917s_9'$

would now specify the composition of the ninth person.
g' is something all the persons have, $s_9'$ is peculiar to
Person 9. The loadings now describe the person, and the
amount of g' “possessed” or demanded by each test can
be estimated by exactly the same techniques employed in
Chapter XV. The score which Test 4 would elicit from
Person 9 would be obtained by inserting the g' and s'
“possessed” by that test into the specification equation
of Person 9, giving—

    $z_{9.4} = .4g_4' + .917s_{9.4}'$

This equation is to be compared with the former equation—

    $z_{4.9} = .6g_9 + .8s_{4.9}$

Both equations ultimately describe the same score, but
$z_{9.4}$ is not identical with $z_{4.9}$. The raw score X is the same,
but the one standardized z is measured from a different
zero, and in different units, from the other. Disregarding
this for the moment, we see that with the exchange of
rôles of tests and persons, the loadings and the factors have
also changed rôles. Formerly, persons possessed different
amounts of g, and tests were differently loaded with it.
Now, tests possess different amounts of g', and persons are
differently loaded with it. We feel impelled to inquire
further into the relationships of these complementary
factors and loadings.

The test which is most highly saturated with g is that
one which, in terms of Spearman’s imagery, requires most
expenditure of general mental energy, and is least dependent
upon specific neural engines. It correlates more
with its fellow-members of the hierarchical battery than
any other test among them does. It represents best what
is common to them all.

The man, in a hierarchical team of men, who is most
highly saturated with g' is that man who is most like all
the others. His correlations with them are higher than is
the case for any other man in the team. He is the individual
who best represents the type. But a nearer approach
to the type can be made by a weighted team of men,
just as formerly we weighted a battery of tests to estimate
their common factor.

4. Weighting examiners like a Spearman battery.—Correlations
of this kind between persons were used long before
any idea of what Stephenson has called “inverted factorial
analysis” was present. The author and a colleague found
in the winter of 1924-5 a number of correlations between
experienced teachers who marked the essays written by
fifty schoolboys upon “Ships” (Thomson and Bailes,
1926). One table or matrix of such correlations between
the class teacher and six experienced head masters who
marked the essays independently of one another, was as
follows:


          Te     A     B     C     D     E     F

    Te     .    .60   .69   .56   .69   .63   .67
    A     .60    .    .53   .50   .54   .55   .68
    B     .69   .53    .    .60   .65   .66   .64
    C     .56   .50   .60    .    .67   .67   .65
    D     .69   .54   .65   .67    .    .54   .69
    E     .63   .55   .66   .67   .54    .    .69
    F     .67   .68   .64   .65   .69   .69    .


In the article in question, these different markers were 
compared by correlating each with the pool of all the rest. 
These correlations are shown in the first row of the table 
below. 

Purely as an illustrative example, let us make also an
approximate analysis of this matrix, and take out at any
rate its chief common factor. On the assumption that it
is roughly hierarchical, we can use Spearman’s formula—

    $\text{Saturation} = \sqrt{\dfrac{A^2 - A'}{T - 2A}}$ *

where A is the sum of one examiner's correlations with the
others, A' the sum of the squares of those correlations, and
T the sum of all the correlations in the table, each pair
being counted twice.

More easily we can insert its largest correlation coefficient
as an approximate communality for each test, and find
Thurstone’s approximate first-factor loadings (see Chapter
V, page 70). We get for the saturations or loadings the
second and third rows of this table:

                                    Te     A     B     C     D     E     F
    Correlation with pool of rest  .77   .67   .76   .73   .76   .75   .82
    Spearman saturations           .814  .704  .796  .766  .798  .788  .861
    Thurstone method               .81   .73   .80   .78   .80   .80   .85

We see that F is the most “typical” examiner of these
essays, in the sense that he is more highly saturated with
what is common to all of them; while A conforms least
to the herd.
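The two rows of saturations can be recomputed from the correlation table. The sketch below is only an illustration: the matrix is typed in as read from the printed table, the function names are invented for this example, and Spearman's formula is taken in the form sqrt((A² − A') / (T − 2A)), with A the sum of a marker's correlations, A' the sum of their squares, and T the sum of every correlation counted twice.

```python
import numpy as np

# Correlations between the seven markers, as read from the printed table.
# The diagonal is set to zero because the formulas below do not use it.
labels = ["Te", "A", "B", "C", "D", "E", "F"]
R = np.array([
    [0.00, 0.60, 0.69, 0.56, 0.69, 0.63, 0.67],
    [0.60, 0.00, 0.53, 0.50, 0.54, 0.55, 0.68],
    [0.69, 0.53, 0.00, 0.60, 0.65, 0.66, 0.64],
    [0.56, 0.50, 0.60, 0.00, 0.67, 0.67, 0.65],
    [0.69, 0.54, 0.65, 0.67, 0.00, 0.54, 0.69],
    [0.63, 0.55, 0.66, 0.67, 0.54, 0.00, 0.69],
    [0.67, 0.68, 0.64, 0.65, 0.69, 0.69, 0.00],
])


def spearman_saturations(R):
    """Return sqrt((A^2 - A') / (T - 2A)) for each variable."""
    A = R.sum(axis=1)               # sum of each marker's correlations
    A_prime = (R ** 2).sum(axis=1)  # sum of the squares of those correlations
    T = R.sum()                     # every correlation, counted twice
    return np.sqrt((A ** 2 - A_prime) / (T - 2 * A))


def thurstone_first_loadings(R):
    """First centroid loadings, with the largest correlation in each row
    inserted in the diagonal as an approximate communality."""
    filled = R + np.diag(R.max(axis=1))
    return filled.sum(axis=0) / np.sqrt(filled.sum())


spearman = spearman_saturations(R)
thurstone = thurstone_first_loadings(R)
for name, s, t in zip(labels, spearman, thurstone):
    print(f"{name:>2}  Spearman {s:.3f}  Thurstone {t:.2f}")
```

On either method F comes out with the highest loading and A with the lowest, which is the sense in which F is called the most typical marker.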


* See Chapter III, page 43. 
an essay's g’ from its examiner scores. That is to say, the
marks given by the different examiners would be weighted
in proportion to the quantities—

    $\dfrac{\text{saturation with } g'}{1 - \text{saturation}^2}$

where g' is that quality of an essay which makes a common
appeal to all these examiners. Their marks (after being
standardized) would therefore be weighted in the proportions
.814/(1 − .814²), etc., that is:

           Te     A     B     C     D     E     F
          2.41  1.40  2.17  1.85  2.20  2.08  3.33
    or     .72   .42   .65   .56   .66   .62  1.00



to make global marks for the essays, which could then be
reduced to any convenient scale. If this were done, the
result would be the “best” estimate* of that aspect or
set of aspects of the essay which all these examiners are
taking into account, disregarding all that can possibly
be regarded as idiosyncrasies of individual examiners.
Whether we think it the best estimate in other senses is a
matter of subjective opinion. We may wish the “idiosyncrasies”
(the specific, that is) of a certain examiner to be
given great weight. It clearly would not do, for example,
to exclude Examiner A from the above team merely because
he is the most different from the common opinion of the
team, without some further knowledge of the men and the
purpose of the examination. The “different” member in
a team might, for example, be the only artist on a committee
judging pictures, or the only Democrat in a court
judging legal issues, or the only woman on a jury trying
an accused girl. But in non-controversial matters, if all
are of about equal experience, it is probable that this
system of weighting, restricting itself to what is certainly
common to all, will be most generally acceptable as
fairest.
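The weighting rule can be checked with a few lines of arithmetic. This is a sketch only: the saturations are the Spearman values quoted for the seven markers, and `regression_weights` and `pooled_mark` are names invented for the illustration.

```python
import numpy as np

labels = ["Te", "A", "B", "C", "D", "E", "F"]
# Spearman saturations of the seven markers, as quoted in the text.
saturations = np.array([0.814, 0.704, 0.796, 0.766, 0.798, 0.788, 0.861])


def regression_weights(sat):
    """Weight for each marker: saturation / (1 - saturation^2)."""
    return sat / (1.0 - sat ** 2)


def pooled_mark(standardized_marks, weights):
    """Weighted global mark for each essay.

    standardized_marks: array of shape (essays, markers), each column
    already standardized; weights: one weight per marker.
    """
    return np.asarray(standardized_marks, dtype=float) @ weights


weights = regression_weights(saturations)
relative = weights / weights.max()   # largest weight scaled to unity
for name, w, r in zip(labels, weights, relative):
    print(f"{name:>2}  {w:.2f}  {r:.2f}")
```

Scaling the largest weight to unity changes only the unit of the global mark, not the order of merit it produces.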

* Best whether we adopt the regression principle or Bartlett's.
For if only one “common factor” is estimated, the difference is
one of unit only, and the weighting in the text is the “best” on
both systems.
use g and o, for them. The equations on page 265 giving
the analysis of the persons then become— 


= 3 
C= ae Veit) EE 

2 1 2 

b= MA 8 +292 via = san + Gam 
1 

d - V5 (VR) = Sera 
2 1 

a= ep C — 2 vig = Dum veh 


It will be seen that the loadings of v, and o, are identical 
with the amounts of γ₁ and γ₂ in the table on page 267.
A similar calculation could be made comparing the amounts
of f₁ and f₂ possessed by the tests with the loadings of γ₁
and γ₂ (suitably destandardized) in the analysis of the
tests. As we said at the outset, if suitable units are chosen 
for the marks and the factors, the loadings of the personal 
equations are the factors of the test equations, and the 
factors of the personal equations are the loadings of the 
test equations. But only for doubly centred matrices of 
marks. It would be wrong to conclude in general that 
loadings and factors are reciprocal in persons and tests.

Indeed, even for doubly centred matrices of marks, this
simple reciprocity holds only for the analysis of the
covariances and not for analyses of the matrices of correlations.
Except by pure accident (and as it happens,
Burt’s example is in the case of test correlations such an
accident), the satur... analysis.

5. Special features of a doubly centred matrix.—But in
any case, a matrix of marks which has been centred both
ways is one in which only a very special kind of residual
association between the variables is present. Most of what
we commonly call the association or resemblance between
either tests or persons, the amount of which we gauge by
the correlation coefficient, is due to something over and
above this. We can write down an infinity of possible raw
matrices from which Burt’s doubly centred matrix might
have come. To the rows of the latter matrix we can add
any quantities we like without in the slightest altering the
correlations between the tests, but making enormous
changes in the correlations between the persons. Let us,
for example, add 10 to the top row, 13 to the middle row,
and 16 to the bottom row. There results the matrix:


          a     b     c     d

    1     4    12    10    14
    2    16    14    12    10    (A)
    3    19    13    17    15


This gives as correlations between the persons : 


           a      b      c      d

    a    1.00    .76    .84   −.14
    b     .76   1.00    .28   −.76
    c     .84    .28   1.00    .42
    d    −.14   −.76    .42   1.00



Next, without changing this matrix of correlations 
between persons in the slightest, we can add any quantities 
we like to the columns of the matrix of marks, and produce 
an infinity of different matrices of correlations between 
tests. If, for example, we add 5, 2, 8, and 9 to the four 
columns, we have a matrix of raw marks : 


          a     b     c     d

    1     9    14    18    23
    2    21    16    20    19    (B)
    3    24    15    25    24


This has the same correlations between persons, but the 
correlations between tests are now : 


           1      2      3
    1    1.00   −.16    .24
    2    −.16   1.00    .92
    3     .24    .92   1.00
5. Example from “The Marks of Examiners.”—This
form of weighting examiners’ marks has probably never
yet been used in practice. But it has been employed, by
Cyril Burt, in an inquiry into the marks given by examiners
(Burt, 1936). As an example, we take the marks given
independently by six examiners to the answer papers of
fifteen candidates aged about 16, in an examination in
Latin. (The example is somewhat unusual, inasmuch as
these candidates were a specially selected lot who had all
been adjudged equal by a previous examiner, but it will
serve as an illustration if the reader will disregard that
fact.) The marks were (op. cit., 20):


                      Examiners
    Cand.    A     B     C     D     E     F
      1     39    43    52    37    43    40
      2     39    44    50    43    43    46
      3     ...   51    ...
      4     37    46    43    ...
      5     38    47    55    35    43    45
      6     45    50    54    ...
      7     42    52    51    45    44    46
      8     48    49    53    47    46    46
      9     32    42    49    34    36    38
     10     37    40    48    37    39    42
     11     38    42    47    39    36    39
     12     40    44    50    41    36    42
     13     38    43    50    36    34    41
     14     35    45    49    37    40    40
     15     32    38    41    28    34    34


The correlations between the examiners calculated from
this table are (the examiner with the highest total correlation
leading):

           F     A     B     E     D     C

    F      .    .86   .84   .82   .84   .71
    A     .86    .    .80   .74   .85   .71
    B     .84   .80    .    .80   .81   .67
    E     .82   .74   .80    .    .72   .69
    D     .84   .85   .81   .72    .    .48
    C     .71   .71   .67   .69   .48    .


If, assuming this table to be hierarchical, we find each
examiner’s saturation with the common factor by Spearman’s
formula, we obtain (with Professor Burt, op. cit.,
294):

           F     A     B     E     D     C
          ...

In the sense, therefore, of being most typical, F is here
the best examiner. The proportionate weights to be given
to each examiner, in making up that global mark for the
candidate which will best agree with the common factor of
the team of examiners, are, as before—

    $\dfrac{\text{saturation}}{1 - \text{saturation}^2}$

provided the marks have first been standardized. The
resulting weights, giving F the weight unity, are:

           F     A     B     E     D     C
          1.00  .61   .54   ...   .29   .15

(If the weights are to be applied to the raw or unstandardized
marks, they must each be divided by that
examiner’s standard deviation.)

The marks thus obtained are only an estimate of the
“true” common-factor mark for each child, just as was
the case in estimating Spearman’s g; and the correlation
of these estimates with the “true” (but otherwise undiscoverable)
mark will be, as there (Chapter XV, page 224)—

    $r_m = \sqrt{\dfrac{S}{1 + S}}$

where S is the sum of all the six quantities—

    $\dfrac{\text{saturation}^2}{1 - \text{saturation}^2}$

In our case this gives— ...

The best examiner's marking itself correlated with the
hypothetical “true” mark to the amount .95, so that
the improvement is not worth the trouble of weighting,
especially as the simple average of the team of examiners
gives .97. But in some circumstances the additional
Or instead, by adding suitable numbers to the columns
and to the rows, we might have arrived at the matrix:

          a     b     c     d

    1    44    48    18    10
    2    63    57    27    13    (C)
    3    58    48    24    10

or equally well at:

          a     b     c     d

    1    35    45    37    43
    2    34    34    26    26    (D)
    3    34    30    28    28

The order of merit of the persons in each test is quite
different in each of these matrices. The order of difficulty
of the tests for each person is quite different in each. If
we consider the ordinary correlation between Tests 1 and 2,
we find that it is negative in (B), zero in (D) and positive
in (C), yet all of these matrices reduce to Burt's matrix
when centred both ways. It is clear that they contain
factors of correlation which are absent in the doubly
centred matrix.


The averages of the rows and the columns of (C) are as
follows:

                 a     b     c     d   | Average
    1           44    48    18    10   |   30
    2           63    57    27    13   |   40
    3           58    48    24    10   |   35

    Average     55    51    23    11

The correlation between two tests is clearly influenced
very much by the fact that here the person a is so much
cleverer than the person d. Similarly, the correlation
between two persons is influenced by the fact that Test 1
is more difficult than Test 2. As soon as the matrix is
centred both ways, all the correlation due to these and
similar influences is almost extinguished. Centred by rows,
(C) becomes:

         14    18   −12   −20
         23    17   −13   −27
         23    13   −11   −25

and all the tests are equally difficult on the average.
Centred by columns as well, it becomes:

         −6     2     0     4
          3     1    −1    −3
          3    −3     1    −1

and not only are all the tests equally difficult on the average,
but all the persons are equally clever on the average. It
is to the covariances still remaining that Burt’s theorem
about the reciprocity of factors and loadings applies. It
does not apply to the full covariances of the matrix centred
only one way, in the manner usually meant when we speak
of covariances or of correlations.
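Centring both ways is a short computation, and it can be used to confirm that quite different raw matrices collapse to the same doubly centred matrix. The sketch below assumes the matrix (C) whose row averages are 30, 40 and 35 and whose column averages are 55, 51, 23 and 11, together with the matrix (B) formed earlier; `double_centre` is a name invented for the illustration.

```python
import numpy as np


def double_centre(marks):
    """Subtract row means and column means, then add back the grand mean."""
    marks = np.asarray(marks, dtype=float)
    row_means = marks.mean(axis=1, keepdims=True)
    col_means = marks.mean(axis=0, keepdims=True)
    return marks - row_means - col_means + marks.mean()


# Burt's doubly centred matrix (tests as rows, persons as columns).
burt = np.array([
    [-6.0,  2.0,  0.0,  4.0],
    [ 3.0,  1.0, -1.0, -3.0],
    [ 3.0, -3.0,  1.0, -1.0],
])

# Two raw matrices, read as (B) and (C) of the text.
B = np.array([[9, 14, 18, 23], [21, 16, 20, 19], [24, 15, 25, 24]])
C = np.array([[44, 48, 18, 10], [63, 57, 27, 13], [58, 48, 24, 10]])

print(double_centre(B))
print(double_centre(C))
```

The order of the two centring steps does not matter, because subtracting row means leaves the column means of the remainder equal to the original column means less the grand mean.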

6. An actual experiment.—In Part III of Burt's The 
Factors of the Mind (London, 1940) his principle of reci- 
procity of tests and persons is seen in an actual illustrative 
experiment on the distribution of temperamental types. 

This experiment was on twelve women students, 
selected because the temperamental assessments made by 
various judges on them were more unanimous than in the 
case of the other students. Each, therefore, was a well- 
marked temperamental type. They were assessed for the 
eleven traits seen in the table below. The assessments 
over each trait were standardized, i.e. measured in such 
units and from such an origin that their sum was zero and 
the sum of their squares twelve, the number of persons, 
so that the group was (artificially) made equal in an 
average of sociability, sex, etc. The correlations between 
the traits were then calculated and centroid factors taken 
out, the first two of which I shall call by the Roman letters 
u and v. These two are possessed in some amount by 
each of the persons and required, in degrees indicated by 
the saturation coefficients, by each of the traits. These
saturation coefficients have been found by analysis of the 


correlations between the traits. 
labour might be worth while, and there is an interest in
knowing which examiners conform least and which most 
to the team, and having a measure of this. 
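The steps of this example can be retraced from the correlations between the six examiners. The sketch below is an illustration under stated assumptions: the matrix is typed in as read from the printed table, Spearman's formula is taken as sqrt((A² − A') / (T − 2A)), and the saturations are rounded to two decimals before the weights are formed, which is an assumption about the working that happens to reproduce the surviving printed weights 1.00, .61, .54, .29 and .15.

```python
import numpy as np

labels = ["F", "A", "B", "E", "D", "C"]
# Correlations between the six examiners (diagonal unused, set to zero).
R = np.array([
    [0.00, 0.86, 0.84, 0.82, 0.84, 0.71],
    [0.86, 0.00, 0.80, 0.74, 0.85, 0.71],
    [0.84, 0.80, 0.00, 0.80, 0.81, 0.67],
    [0.82, 0.74, 0.80, 0.00, 0.72, 0.69],
    [0.84, 0.85, 0.81, 0.72, 0.00, 0.48],
    [0.71, 0.71, 0.67, 0.69, 0.48, 0.00],
])

# Spearman saturations: A is each examiner's total correlation, T the
# sum of all correlations with every pair counted twice.
A = R.sum(axis=1)
saturations = np.sqrt((A ** 2 - (R ** 2).sum(axis=1)) / (R.sum() - 2 * A))
sat2 = np.round(saturations, 2)        # assumed two-decimal working

# Weights saturation / (1 - saturation^2), scaled so that F has weight unity.
weights = sat2 / (1 - sat2 ** 2)
relative_weights = weights / weights[0]

# Correlation of the weighted estimate with the common factor.
S = (sat2 ** 2 / (1 - sat2 ** 2)).sum()
r_m = np.sqrt(S / (1 + S))

for name, s, w in zip(labels, sat2, relative_weights):
    print(f"{name}  saturation {s:.2f}  weight {w:.2f}")
print(f"r_m = {r_m:.3f}")
```

On these figures the weighted team correlates about .98 with the common factor, against .95 for examiner F alone, which bears out the remark that the gain from weighting is small here.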

After the saturation of each examiner with the hypothet- 
ical common factor has been found, the correlations due 
to that factor can be removed from the table exactly as 
in analysing tests. The residues, as there, may show 
the presence of other factors; and * specific " resem- 
blances or antagonisms between pairs of examiners, or 
minor factors running through groups of examiners, may 
be detected and estimated.

In short, all the methods used on correlations between 
tests may be employed on correlations between examiners. 
The tests have come alive and are called examiners, that 
is all. But since the child’s performance, judged by 
the different examiners differently, is here nevertheless 
the same identical performance, our interpretation of the 
results is different. The two cases throw light on one 
another. A Spearman hierarchical battery of tests may 
estimate each child’s general intelligence, which is there 
something in common among the tests. The examiners 
may have been instructed to mark exclusively for what 
they think is general intelligence. In that case their 
weighted team will estimate for each child a general
intelligence, which is something in common among the 
somewhat discrepant ideas the examiners hold on this 
matter. 

6. Preferences for school subjects.—In the previous sections
we have discussed correlations between examiners
who all mark the same examination papers. The purpose
of their marking these papers is to award prizes, distinctions,
passes, and failures to the candidates. The examiners
are a means to this end; the reason for employing
several of them is to obtain a list of successes and failures
in which we can have greater confidence. The technique
described is one which enables us to combine their marks,
on certain assumptions, to greatest advantage. But it
... described in The Marks of Examiners,
... individual examiners, and to evaluate
... examining.
It is only a step to another, very similar, experiment in
which objects evaluated by the “examiners” are not the
works of candidates in an examination, but are objects
chosen for the express purpose of gaining an insight into
the minds of those asked to judge them. Thus we might
ask several persons each to evaluate on some scale the
æsthetic appeal of forty or fifty works of art (Stephenson,
1936b, 353), or ask a number of school pupils each to place
in order of interest a list of school subjects.

Stephenson (1936a) asked forty boys and forty girls
attending a higher school in Surrey, England, thus to 
place in order of their preference twelve school subjects 
represented by sixty examination papers, and calculated 
for about half these pupils the correlation coefficients 
between them. To explain the kind of outcome that may 
be expected from such an experiment it will be sufficient 
for us to quote his data for a smaller number of pupils, 
say eight girls, avoiding anomalous cases for simplicity in 
a first consideration. The correlations between them were 
as follows (op. cit., 50) : 


    Girl     3      4      5      7     17     18     19     20

      3      .     .59    .31    .26   −.02   −.16   −.38   −.35
      4     .59     .     .75    .42   −.28   −.01   −.66   −.03
      5     .31    .75     .     .65   −.29   −.02   −.18   −.08
      7     .26    .42    .65     .    −.50   −.15   −.54   −.17
     17    −.02   −.28   −.29   −.50     .     .60    .52    .72
     18    −.16   −.01   −.02   −.15    .60     .     .09    .79
     19    −.38   −.66   −.18   −.54    .52    .09     .     .40
     20    −.35   −.03   −.08   −.17    .72    .79    .40     .


This table at once suggests that these girls fall into two 
types. Girls 3, 4, 5, and 7 correlate positively among 
themselves; they have somewhat similar preferences 
among school subjects. Girls 17, 18, 19, and 20 correlate 
positively among themselves. But the two groups correlate 
negatively with one another. The two types were different 
in their order of preference, Type I tending, for example,
to put English and French higher, and Physics and
Chemistry lower, than Type II (though both were agreed
that Latin was about the least lovable of their studies!).
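The division into two types can be put in figures by averaging the correlations inside each group and across the groups. The sketch below types in the table of correlations between the eight girls as read from the print, filling the lower triangle by symmetry; the grouping into girls 3, 4, 5, 7 and girls 17, 18, 19, 20 follows the text, and the helper names are invented for the illustration.

```python
import numpy as np

girls = [3, 4, 5, 7, 17, 18, 19, 20]
# Correlations between the eight girls (diagonal unused, set to zero).
R = np.array([
    [ 0.00,  0.59,  0.31,  0.26, -0.02, -0.16, -0.38, -0.35],
    [ 0.59,  0.00,  0.75,  0.42, -0.28, -0.01, -0.66, -0.03],
    [ 0.31,  0.75,  0.00,  0.65, -0.29, -0.02, -0.18, -0.08],
    [ 0.26,  0.42,  0.65,  0.00, -0.50, -0.15, -0.54, -0.17],
    [-0.02, -0.28, -0.29, -0.50,  0.00,  0.60,  0.52,  0.72],
    [-0.16, -0.01, -0.02, -0.15,  0.60,  0.00,  0.09,  0.79],
    [-0.38, -0.66, -0.18, -0.54,  0.52,  0.09,  0.00,  0.40],
    [-0.35, -0.03, -0.08, -0.17,  0.72,  0.79,  0.40,  0.00],
])

type_one = [0, 1, 2, 3]    # positions of girls 3, 4, 5, 7
type_two = [4, 5, 6, 7]    # positions of girls 17, 18, 19, 20


def mean_within(R, members):
    """Average correlation between distinct members of one group."""
    block = R[np.ix_(members, members)]
    upper = np.triu_indices(len(members), k=1)
    return block[upper].mean()


def mean_between(R, first, second):
    """Average correlation between members of two different groups."""
    return R[np.ix_(first, second)].mean()


within_one = mean_within(R, type_one)
within_two = mean_within(R, type_two)
between = mean_between(R, type_one, type_two)
print(f"{within_one:.2f} {within_two:.2f} {between:.2f}")
```

The average correlation is about .50 within each type and about −.24 between them, which is the quadrant pattern the table suggests at sight.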


272 THE FACTORIAL ANALYSIS OF HUMAN ABILITY 


Now according to the reciprocity principle, if we analyse 
instead the correlations between the persons, find factors 
which we may indicate by Greek letters, and measure the 
amounts of these possessed by the eleven traits, these 
amounts ought to be the same as the saturation coefficients 
of the Roman factors u, v, etc.

Burt therefore further standardizes the assessments, 
by persons this time, and finds the total scores on each 
trait, which are, by a property of centroid factors (see 
page 217) proportional to the amounts of a centroid Greek 
factor possessed by the eleven traits; and the test of the 
reciprocity hypothesis is to see whether these totals are 
similar to the saturations of a Roman factor. The figures
(from Burt's page 405) are given in the table below : 


                       Saturations of the    |  Amounts of the
                         Roman factors       |   Greek factor
                          u          v       |        α
    Sociability         .671       .508      |      .587
    Sex                 .878       .218      |      .489
    Assertiveness       .827       .488      |      .378
    Joy                 .951       .233      |      .297
    Anger               .824       .241      |      .280
    Curiosity           .780      −.268      |      .001
    Fear                .898      −.159      |     −.089
    Sorrow              .259      −.104      |     −.387
    Tenderness          .564      −.667      |     −.447
    Disgust             .880      −.490      |     −.489
    Submissiveness      .419      −.685      |     −.525

PART VI 
THE INFLUENCE OF SELECTION 
7. A parallel with a previous experiment.—This experiment,
it will be seen, forms a parallel to that inquiry (also
by Stephenson) described in Chapter I, Section 9, where
tests fell into two types, verbal and pictorial, with correlations
falling there as here into four quadrants. If we call
the two types of school pupil here the linguistic (L) and
the scientific (S), and again use C for the cross-correlations,
the diagram corresponding to that on page 16 of Chapter I
is:

            L | C
            --+--
            C | S

The chief difference between the two cases is that there
the cross-correlations, though smaller than hierarchical
order in the whole table would demand, were nevertheless
positive. Here, however, the cross-correlations are
actually negative.

It is true that the signs ...
quadrants can in either case be reversed ...
pupils). But that is not really
... We have no doubt which is ...

In Chapter I we explained the correlations and their
tetrad-differences by the hypothesis of three uncorrelated
factors g, v, ... in various proportions by the
... in various amounts by the children.
The loadings which indicated the proportions of the factors
... assumed to be all positive. Thurstone
expressly says that it is contrary to psychological
expectation to have more than occasional negative loadings.

8. Negative loadings.—Let us endeavour to make at least 
a qualitative scheme of factors to express the correlations 
between the pupils, factors possessed in various amounts 
by the subjects of the school curriculum, and demanded 
in various proportions by each pupil before he will call 
the subject interesting. One type of pupil weights heavily 
the linguistic factor in a subject in evaluating its interest 
to him. The other type weights heavily the scientific 
factor in a subject in judging its attraction for him. But 
to explain actual negative correlations between pupils we 
must assume that some of the loadings are negative, 
assume, that is, that some of the children are actively 
repelled by factors which attract others. Common sense 
does not think thus. Common sense says that two children 
may put the subjects in opposite orders, even though they 
both like them all, provided they don’t like them equally 
well. But then common sense is not anxious to analyse 
the children into uncorrelated additive factors. If each 
child is thus expressed as the weighted sum of various 
factors, two children can correlate negatively only if some 
of the loadings are negative in the one child and positive 
in the other, for the correlation is the inner product of the 
loadings. Since Stephenson has found numerous nega- 
tive correlations between persons, and since few negative 
correlations are reported between tests, we seem here to 
have an experimental difference between the two kinds of 
correlation, and if ever correlations between persons come 
to be analysed as minutely and painstakingly as correla- 
tions between tests, it would seem that the free admission 
of negative loadings would be necessary.* The present 
matrix can in fact be roughly analysed into two general 
factors, one of which has positive loadings in all pupils, 
while the other is positively loaded in the one type, 


negatively loaded in the other.
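The argument that negative correlations require negative loadings can be seen in a toy case. The loadings below are hypothetical values chosen for illustration, not figures from Stephenson's data: each pupil is given a loading on a factor common to all pupils and a loading on a second factor that is positive for one type and negative for the other.

```python
import numpy as np

# Hypothetical loadings on (factor shared by all, factor dividing the types).
linguistic_pupil = np.array([0.3, 0.7])
second_linguistic_pupil = np.array([0.4, 0.6])
scientific_pupil = np.array([0.3, -0.7])


def correlation_from_loadings(first, second):
    """Correlation due to uncorrelated common factors: the inner product
    of the two sets of loadings."""
    return float(np.dot(first, second))


same_type = correlation_from_loadings(linguistic_pupil, second_linguistic_pupil)
opposite_type = correlation_from_loadings(linguistic_pupil, scientific_pupil)
print(round(same_type, 2), round(opposite_type, 2))
```

With every loading positive the inner product is a sum of positive terms, so a negative correlation can only arise when at least one factor is loaded with opposite signs in the two pupils.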
9. An analysis of moods.—A still more ingenious application
by Stephenson of correlations between persons is in
an experiment in which for each person a “population”

* See Stephenson, 1936b, 349.
in the complete population—we must, of course, in each
case define what we mean by the complete population, as
for example all living adults who were born in ...—
given by $\Sigma_1$, $\Sigma_2$, $\Sigma_3$, and $\Sigma_4$, and their correlations by
$R_{12}$, $R_{13}$, etc. Now let a selection of persons be made who
are more homogeneous in the first quality—say, in an
intelligence test which has been given to them all—so that
its standard deviation in the sample is only $\sigma_1$, and write—

    $p_1 = \sigma_1 / \Sigma_1$

The smaller $p_1$ is, the more homogeneous the group is in
intelligence-test score. If we write—

    $q_1 = \sqrt{1 - p_1^2}$

$q_1$ will be larger, the ...

    $q_i = q_1 R_{1i}$

For the sort of reason indicated earlier in this paragraph
the correlations of the four qualities—which we are for
simplicity in exposition assuming to be positively correlated
in the whole population—will also alter, according to the
formula—

    $r_{ij} = \dfrac{R_{ij} - q_i q_j}{p_i p_j}$
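The selection formula lends itself to a short numerical sketch. The population correlations below are hypothetical, chosen only for illustration; the function takes q_i = q_1 R_1i and p_i = sqrt(1 − q_i²) for every variable, with the directly selected variable listed first, and `correlations_after_selection` is a name invented for the example.

```python
import numpy as np


def correlations_after_selection(R, p1):
    """Apply r_ij = (R_ij - q_i q_j) / (p_i p_j) after selection on variable 1.

    R  : population correlation matrix, selected variable first.
    p1 : ratio of sample to population standard deviation of variable 1.
    Returns the sample correlations and the ratios p_i for all variables.
    """
    R = np.asarray(R, dtype=float)
    q1 = np.sqrt(1.0 - p1 ** 2)
    q = q1 * R[0]                # q_i = q_1 R_1i (R_11 = 1 gives q_1 itself)
    p = np.sqrt(1.0 - q ** 2)    # shrinkage of each standard deviation
    r = (R - np.outer(q, q)) / np.outer(p, p)
    return r, p


# Hypothetical population correlations between three qualities.
R_population = np.array([
    [1.0, 0.5, 0.4],
    [0.5, 1.0, 0.3],
    [0.4, 0.3, 1.0],
])

r_sample, p = correlations_after_selection(R_population, p1=0.6)
print(np.round(r_sample, 3))
print(np.round(p, 3))
```

In this made-up case, reducing the standard deviation of the first quality to six-tenths of its population value lowers the correlation between the second and third qualities from .30 to about .20, although neither was selected for directly.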


represented by lines all crossing each ...
“average man,” and at angles with one
another whose cosines equal the correlation coefficients
between the tests (see Chapter VI).

In this perspective figure let OA, OB, and OC be three
lines in three-fold space representing three tests. The
triangle ABC is in a plane at right angles to OA. Write—

    $\cos\alpha = \cos BOA = R_{12}$
    $\cos\beta = \cos COA = R_{13}$
    $\cos\gamma = \cos BOC = R_{23}$

Take the distance OA as unity. Each test is standardized,
so that its standard deviation is unity. Now let the
standard deviation of Test 1 be reduced so that it becomes
$p_1 = OD$. This means, in our geometrical model, that the
whole three-fold space in which our lines OA, OB,
and OC exist is compressed from A towards O, and
every line parallel to this is shortened in the same way.
The point B moves up to E, and the point C to F.
The whole triangle ABC is lifted up, remaining at
right angles to the line OA, to a new position DEF.
The test lines OB and OC become OE and OF. The
angle γ = BOC has become the angle γ' = EOF, and
cos γ' represents the new correlation coefficient between
Tests 2 and 3. Our object is to find cos γ' in terms
of the known quantities α, β, γ, and $p_1$. One method is to
express BC² in terms of the triangle BOC, and EF² in terms
of the triangle EOF, and equate them, since BC = EF.

[Figure 31.]


First note that

    $OB^2 - OE^2 = OA^2 - OD^2 = 1 - p_1^2 = q_1^2$

and similarly

    $OC^2 - OF^2 = q_1^2$

Also $p_2 = OE/OB$, and $p_3 = OF/OC$.

Further,

    $q_2^2 = 1 - p_2^2 = \dfrac{OB^2 - OE^2}{OB^2} = q_1^2 / OB^2$

and similarly

    $q_3^2 = q_1^2 / OC^2$

Now, since

    $BC^2 = OB^2 + OC^2 - 2\,OB \cdot OC \cos\gamma$

and

    $EF^2 = OE^2 + OF^2 - 2\,OE \cdot OF \cos\gamma'$
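The remaining algebra is short, and may be sketched as follows. This is one way of completing the step from the relations just written down, using only BC = EF, the two differences of squares equal to q_1^2, and the definitions of p_2, p_3, q_2 and q_3; it is offered as a reconstruction of the argument, not as a quotation of it.

```latex
\begin{align*}
BC^2 = EF^2
  &\;\Rightarrow\; OB^2 + OC^2 - 2\,OB\cdot OC\cos\gamma
     = OE^2 + OF^2 - 2\,OE\cdot OF\cos\gamma' \\
  &\;\Rightarrow\; OE\cdot OF\cos\gamma' = OB\cdot OC\cos\gamma - q_1^2
     \qquad\text{since } OB^2 - OE^2 = OC^2 - OF^2 = q_1^2 \\
  &\;\Rightarrow\; p_2\,p_3\cos\gamma'
     = \cos\gamma - \frac{q_1}{OB}\cdot\frac{q_1}{OC}
     = R_{23} - q_2\,q_3
     \qquad\text{dividing by } OB\cdot OC \\
  &\;\Rightarrow\; \cos\gamma' = \frac{R_{23} - q_2\,q_3}{p_2\,p_3}
\end{align*}
```

This agrees in form with the selection formula r_ij = (R_ij − q_i q_j) / (p_i p_j) for the pair of tests 2 and 3.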
of thirty moods, such as “irascible,” “cheerful,” “sunny,”
were rated for their prevalence and intensity for each of
ten patients in a mental hospital, and for six normal
persons (Stephenson, 1936c, 363). This time the correlation
table indicated three types, corresponding to the
manic-depressives, the schizophrenes, and the normal
persons, each type correlating positively within itself, but
negatively or very little with the other types. These
experiments were only illustrative, and it remains to be
seen whether factors which will prove acceptable psychologically
will be isolated in persons in the same manner as g,
and the verbal factor, have been isolated in tests. The
parallel between the two kinds of correlation and analysis 


is, however, certainly likely to throw light on the nature of 
factors of both kinds. 


CHAPTER XVII 


THE RELATION BETWEEN TEST FACTORS 
AND PERSON FACTORS 


1. Burt’s example, centred both by rows and by columns.—In 
the examples we have just considered, there is no doubt 
that correlations between persons can be calculated without 
absurdity. In the matrix of marks given by a number of ex- 
aminers (marking the same paper) to a number of candidates, 
either two candidates can be correlated or two examiners. 
The heterogeneity of marks referred to in Chapter XVI, 
Section 1, does not enter as a difficulty. Still keeping to 
such material, let us ask ourselves what the relation is 
between factors found in the one way, and factors found in 
the other. Qualitatively, we have already suggested that 
factors and loadings change rôles in some manner. The
most determined attempt to find an exact relationship has 
been that made by Cyril Burt, who concludes that, if the 
initial units have been suitably chosen, the factors of the 
one kind of analysis are identical with the loadings of the 
other, and vice versa (Burt, 1937b). The present writer, 
while agreeing that this is so in the very special circum- 
stances assumed by Burt, is of opinion that his is a very 
narrow case, and that the factors considered by Burt are 
not typical of those in actual use in experimental psycho- 
logy. Theoretically, however, Burt’s paper is of very great 
interest. It can be presented to the general reader best
by using Burt’s own small numerical example, based on a
matrix of marks for four persons in three tests:

                     Persons
                 a     b     c     d

            1   −6     2     0     4
    Tests   2    3     1    −1    −3
            3    3    −3     1    −1

It will be noticed that this matrix of marks is already
centred both ways. The rows add up to zero, and so do
the columns. The test scores have been measured from
their means, and then thereafter the columns of personal
scores have been measured from their means; or it can
be done persons first, tests second, the end-result being
the same. Burt does not give the matrix of raw scores
from which the above matrix comes.

Taking the doubly centred matrix as he gives it, the

MEL UA the doubly centred matrix as he gives it, the 
matrices of variances and covariances formed from it are : 


Test Covariances 

        |    1     2     3 
      --+------------------ 
      1 |   56   −28   −28 
      2 |  −28    20     8 
      3 |  −28     8    20 


Person Covariances 

        |    a     b     c     d 
      --+------------------------ 
      a |   54   −18     0   −36 
      b |  −18    14    −4     8 
      c |    0    −4     2     2 
      d |  −36     8     2    26 
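
The two covariance tables can be checked mechanically. The following is a minimal sketch in Python, assuming the doubly centred marks are −6, 2, 0, 4; 3, 1, −1, −3; 3, −3, 1, −1 (the values consistent with both tables). The helper names are hypothetical. As in the text, "covariance" here means a plain sum of products, with no division by the number of cases.

```python
# Doubly centred marks: rows are Tests 1-3, columns are persons a-d.
marks = [
    [-6, 2, 0, 4],
    [3, 1, -1, -3],
    [3, -3, 1, -1],
]


def transpose(m):
    """Turn rows into columns."""
    return [list(col) for col in zip(*m)]


def product_sums(rows):
    """Sum of products between every pair of rows (no division by n)."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in rows] for r1 in rows]


row_sums = [sum(r) for r in marks]               # each test's marks sum to zero
col_sums = [sum(c) for c in transpose(marks)]    # each person's marks sum to zero

test_cov = product_sums(marks)                   # 3 x 3, tests with tests
person_cov = product_sums(transpose(marks))      # 4 x 4, persons with persons

print(row_sums, col_sums)
for row in test_cov:
    print(row)
for row in person_cov:
    print(row)
```

Because the marks are centred both ways, every column of either covariance table also sums to zero, which is the property remarked on in the text.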


Notice that in both these matrices the columns add to 
zero, just as they do in the matrices of residues in the 
“centroid” process. 

2. Analysis of the covariances.—Burt next proceeds to 
analyse each of these by Hotelling's method. It seems 
clear that there will exist some relation between the two 
analyses, since the primary origin of each matrix is the 
same table of raw marks, and to show that relation most 
clearly Burt analyses the covariances direct, and not the 
correlations which could be made from each table (by 
dividing each covariance […] rotation would here be the 
same […] 

Analysis of the Tests 

\[
\begin{aligned}
z_1 &= 2\sqrt{14}\,\gamma_1 \\
z_2 &= -\sqrt{14}\,\gamma_1 + \sqrt{6}\,\gamma_2 \\
z_3 &= -\sqrt{14}\,\gamma_1 - \sqrt{6}\,\gamma_2
\end{aligned}
\]

Analysis of the Persons 

\[
\begin{aligned}
a &= -3\sqrt{6}\,f_1 \\
b &= \sqrt{6}\,f_1 + 2\sqrt{2}\,f_2 \\
c &= -\sqrt{2}\,f_2 \\
d &= 2\sqrt{6}\,f_1 - \sqrt{2}\,f_2
\end{aligned}
\]

In both cases two factors are sufficient (there will always 
be fewer Hotelling or centroid factors than tests with 
a doubly centred matrix of marks, for a mathematical 
reason). The reader can check that the inner products 
give the covariances, e.g.— 
\[ \text{covariance}\,(bd) = \sqrt{6} \times 2\sqrt{6} - 2\sqrt{2} \times \sqrt{2} = 12 - 4 = 8 \]
The method of finding Hotelling loadings was described 
in Chapter VII, and the reader can readily check that the 
coefficients of $\gamma_1$, for example, do act as required by that 
method. For if we use numbers proportional to $2\sqrt{14}$, 
$-\sqrt{14}$, and $-\sqrt{14}$, namely 1, −½, −½, as Hotelling 
multipliers we get : 

       56   −28   −28   |    1 
      −28    20     8   |   −½ 
      −28     8    20   |   −½ 

       56   −28   −28 
       14   −10    −4 
       14    −4   −10 
      ----------------- 
       84   −42   −42 

proportional to 1, −½, −½ as required. 
The largest total (84) is the first “latent root,” and the 
multipliers 1, −½, −½ have to be divided, according to 
Chapter VII, by the square root of the sum of their squares, 
and multiplied by the square root of 84, giving— 

\[ 2\sqrt{14}, \qquad -\sqrt{14}, \qquad -\sqrt{14} \]
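
The check just described can be repeated by machine. The sketch below is only an illustration of the iterative idea (multiply, total, rescale, repeat), not a transcription of the procedure in Chapter VII; the function name and the starting multipliers are assumptions made for the example.

```python
import math

# Test covariances (sums of products) from the doubly centred marks.
cov = [
    [56, -28, -28],
    [-28, 20, 8],
    [-28, 8, 20],
]


def first_hotelling_factor(matrix, start, rounds=100):
    """Iterate trial multipliers until the totals are proportional to them."""
    w = list(start)
    root = 0.0
    for _ in range(rounds):
        # Weighted totals: each row of the matrix combined with the multipliers.
        totals = [sum(a * x for a, x in zip(row, w)) for row in matrix]
        root = max(totals, key=abs)        # largest total; the latent root once settled
        w = [t / root for t in totals]     # rescale so the largest multiplier is 1
    norm = math.sqrt(sum(x * x for x in w))
    loadings = [math.sqrt(root) * x / norm for x in w]
    return root, w, loadings


root, multipliers, loadings = first_hotelling_factor(cov, [1, 0, 0])
print(root)          # first latent root
print(multipliers)   # proportional to 1, -1/2, -1/2
print(loadings)      # 2*sqrt(14), -sqrt(14), -sqrt(14)
```

Starting from any multipliers not orthogonal to the first factor, the totals settle on the same proportions; here they do so after a single round.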


Now according to the reciprocity principle, if we analyse 
instead the correlations between the persons, find factors 
which we may indicate by Greek letters, and measure the 
amounts of these possessed by the eleven traits, these 
amounts ought to be the same as the saturation coefficients 
of the Roman factors u, v, etc. 

Burt therefore further standardizes the assessments, 
by persons this time, and finds the total scores on each 
trait, which are, by a property of centroid factors (see 
page 217) proportional to the amounts of a centroid Greek 
factor possessed by the eleven traits ; and the test of the 
reciprocity hypothesis is to see whether these totals are 
similar to the saturations of a Roman factor. The figures 
(from Burt’s page 405) are given in the table below : 


                      Saturations of the     Amounts of the 
                        Roman factors         Greek factor 

                         u         v               α 
Sociability            .671      .508            .587 
Sex                    .878      .218            .489 
Assertiveness          .827      .488            .378 
Joy                    .951      .238            .297 
Anger                  .824      .241            .280 
Curiosity              .780     −.268            .001 
Fear                   .898     −.159           −.089 
Sorrow                 .259     −.494           −.337 
Tenderness             .564     −.667           −.447 
Disgust                .830     −.490           −.489 
Submissiveness         .412     −.685           −.525 


Clearly the amounts of α do not correspond to the 
saturations of u; nor should they, for a general factor 
[…]ated by the double standardization. 


PART VI 
THE INFLUENCE OF SELECTION 


CHAPTER XVIII 


THE INFLUENCE OF UNIVARIATE SELECTION 
ON FACTORIAL ANALYSIS* 


1. Univariate selection.—All workers with intelligence 
tests know, or ought to know, that the correlations found 
between tests, or between tests and outside criteria, depend 
to a very great extent indeed upon the homogeneity or 
heterogeneity of the sample in which the correlations were 
measured. If, to take the usual illustration, we measure 
the correlation between height and weight in a sample of 
the population which includes babies, children, and grown- 
ups, we shall obviously get a very high result. If we 
confine our measurement to young people in their 'teens, 
we shall usually get a smaller value for the coefficient of 
correlation. If we make the group more homogeneous 
still, taking, say, only boys, and all of the same race and 
exactly the same age, the correlation of height and weight 
will be still less.† Through all these changes towards 
greater homogeneity in age, the standard deviation (or its 
square, the variance) of height has also been sinking, and 
the standard deviation of weight also. The formulæ which 
describe these changes were given in 1902 by Professor 
Karl Pearson,‡ and when the selection of the persons 
forming the sample is made on the basis of one quality 
only, these formulæ can be put into the following very 
simple form. 

Let the standard deviations of (say) four qualities be 


* Thomson, 1937 and 1938b. 

† Greater homogeneity need not necessarily, in the mathematical 
sense, decrease correlation, and occasionally it does not do so in 
actual psychological experiments. But it almost always does so. 

‡ These formulæ are not, as was once thought, only applicable if 
all distributions are normal (see Lawley, 1943c, where the necessary 
conditions are stated). They have been found by trial to give good 
results even when the sample has been made by cutting off a tail, or 
both tails, of the distribution. 

we have, subtracting, […] whence 

\[ r_{23} = \cos\gamma' = \frac{R_{23} - q_2 q_3}{p_2 p_3} \]


3. A numerical example.—Let us define our “whole 
population” as all the eleven-year-old children in Mas- 
sachusetts, and let us suppose (the numbers are entirely 
fictitious) that the standard deviations of all their scores 
in four tests are : 

      1. Stanford-Binet test       16.5 = $\Sigma_1$ 
      2. The X reading test        24.9 = $\Sigma_2$ 
      3. The Y arithmetic test     27.3 = $\Sigma_3$ 
      4. The Z drawing scale       14.2 = $\Sigma_4$ 

while the correlations between these four, in a State-wide 
survey, are (these are the R correlations) : 

        |    1      2      3      4 
      --+---------------------------- 
      1 |    .     .69    .75    .32 
      2 |   .69     .     .54    .18 
      3 |   .75    .54     .     .06 
      4 |   .32    .18    .06     . 

Now let a sample of Massachusetts eleven-year-olds be 
taken who are less widely […] 
a standard deviation in their Stanford-Binet scores of 
only 10.2. How will all the other quantities listed above 
tend to alter in this sample? We have, using the formulæ 
quoted, the following— 

\[ p_1 = \frac{10.2}{16.5} = .618 \]
\[ q_1 = \sqrt{1 - .618^2} = .786 \]

and from $q_i = q_1 R_{1i}$ we have the other shrinkages $q_i$, and 
thence the coefficients $p$ and the new standard deviations 
$\sigma = p\Sigma$ : 

              1       2       3       4 
      q     .786    .542    .590    .252 
      p     .618    .840    .808    .968 
      σ    10.2    20.9    22.1    13.7 

The formula for $r_{ij}$ then enables us at once to calculate 
the correlations to be expected in the sample, namely : 

        |    1       2       3       4 
      --+-------------------------------- 
      1 |    .     .509    .574    .204 
      2 |  .509      .     .325    .054 
      3 |  .574    .325      .    −.113 
      4 |  .204    .054   −.113      . 

The greater homogeneity in the sample has made all the 
correlation coefficients smaller, and has indeed made $r_{34}$ 
become negative. 
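
The arithmetic of this example can be reproduced with a few lines of code. The following is a minimal sketch under stated assumptions: the population correlations are taken as .69, .75, .32, .54, .18, .06 and the fourth standard deviation as 14.2; the function name `select_on_first` is hypothetical. It applies $q_i = q_1 R_{1i}$, $p_i = \sqrt{1 - q_i^2}$, $\sigma_i = p_i \Sigma_i$ and $r_{ij} = (R_{ij} - q_i q_j)/(p_i p_j)$.

```python
import math


def select_on_first(R, Sigma, sigma1):
    """Expected sample values when variate 0 is the directly selected one.

    R      -- population correlation matrix (list of lists)
    Sigma  -- population standard deviations
    sigma1 -- standard deviation of variate 0 in the sample
    """
    n = len(R)
    p1 = sigma1 / Sigma[0]
    q1 = math.sqrt(1 - p1 ** 2)
    q = [q1] + [q1 * R[0][i] for i in range(1, n)]      # shrinkages
    p = [math.sqrt(1 - x * x) for x in q]               # p[0] equals p1
    sigma = [pi * Si for pi, Si in zip(p, Sigma)]       # new standard deviations
    r = [[1.0 if i == j else (R[i][j] - q[i] * q[j]) / (p[i] * p[j])
          for j in range(n)] for i in range(n)]         # new correlations
    return q, p, sigma, r


R = [
    [1.00, 0.69, 0.75, 0.32],
    [0.69, 1.00, 0.54, 0.18],
    [0.75, 0.54, 1.00, 0.06],
    [0.32, 0.18, 0.06, 1.00],
]
Sigma = [16.5, 24.9, 27.3, 14.2]

q, p, sigma, r = select_on_first(R, Sigma, 10.2)
print([round(x, 3) for x in q])
print([round(x, 3) for x in p])
print([round(x, 1) for x in sigma])
for row in r:
    print([round(x, 3) for x in row])
```

Small differences in the third decimal place from hand-worked figures are to be expected, since the code carries full precision through every step.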
The reader should note that these standard deviations 
and correlations are what result from selecting on the 
Stanford-Binet test, letting the other changes happen in 
consequence. It would be quite a different matter to select on 
the X reading test. Even if we did so, so as to reduce the 
reading test standard deviation from 24.9 to 20.9 as 
happened above, the other changes would be quite different. 
The Stanford-Binet standard deviation would, for 
example, not be reduced to 10.2 but only to 15.3. And $r_{13}$ 
would not be .574, but .722. The difference, in terms of 
our Figure 31, is that whereas selecting the Stanford-Binet 
corresponded to shortening the line OA and with it all 
parallel distances in the space, selecting the reading test 
corresponds to shortening OB and all distances parallel to 
it: quite a different distortion of the space. 

4. From sample to population.—In the above numerical 
example we supposed that the standard deviations and 
correlation coefficients were known in the whole population 
of Massachusetts eleven-year-old children, and asked 
what they would become in a sample with a smaller scatter 
in the Stanford-Binet score. The problem might, however, 
be reversed, in which case, with a little care, the same 
formulæ can be used. 

Let us suppose that we know from experiment the above 
facts about the sample—the standard deviations 10.2, 
20.9, 22.1, 13.7, and all the correlation coefficients in the 
table .509, .574, etc.—and that we know further that the 
standard deviation of Stanford-Binet scores in the whole 
population in question is 16.5. The sample we have 
worked with is obviously a biased one, restricted in range 
of Stanford-Binet scores, and we wish to estimate what the 
correlation coefficients would have been if we had tested 
all Massachusetts eleven-year-olds, or, at least, an 
unbiased sample. We want, indeed, to work the above 
example backwards. 

The quantity $p_1$ is, in this direction, greater than unity; 
namely— 

\[ p_1 = \frac{16.5}{10.2} = 1.618 \]
\[ q_1^2 = 1 - p_1^2 = 1 - 2.617 = -1.617 \]

The quantity $q_1$ is therefore the square root of a minus 
quantity, which we express as— 

\[ q_1 = \sqrt{1.617}\,i = 1.272i, \quad \text{where } i = \sqrt{-1} \]

The other $q$s can be got from $q_1$ by the same formula as 
before, namely $q_i = q_1 R_{1i}$, where $R$ (not $r$) means a correlation 
coefficient in the sample. Thus— 

\[ q_2 = q_1 R_{12} = 1.272i \times .509 = .647i \]

and 

\[ q_3 = q_1 R_{13} = 1.272i \times .574 = .730i \]

Then— 

\[ p_2^2 = 1 - q_2^2 = 1 + .647^2 \ (\text{for } i^2 = -1) = 1.419; \qquad p_2 = 1.191 \]

and similarly $p_3 = 1.238$. 

We then have— 

\[ r_{23} = \frac{R_{23} - q_2 q_3}{p_2 p_3} = \frac{.325 - .647i \times .730i}{1.191 \times 1.238} = \frac{.325 + .472}{1.475} = .54 \]


as in the table for the population. In this way that table 
can be completely reconstituted. It is then, of course, 
only an estimate and, moreover, an estimate based on the 
assumption that our sample differs from the population 
only by reason of one of the four variables—namely, the 
Stanford-Binet score—being restricted, deliberately or 
accidentally, the other restrictions being supposed to have 
followed sympathetically by reason of the correlations. 
In few practical examples can we be sure of the mode of 
selection. 
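
The backward calculation, imaginary shrinkages included, can be sketched with Python's built-in complex arithmetic. This is an illustration only: the function name is hypothetical, and the sample figures fed in are the rounded ones quoted above (standard deviations 10.2, 20.9, 22.1, 13.7 and correlations .509, .574, .204, .325, .054, −.113), so the population values are recovered only to about two decimal places.

```python
import cmath


def population_from_sample(r, sigma, Sigma1):
    """Work the selection formulae backwards from sample to population.

    r      -- sample correlation matrix
    sigma  -- sample standard deviations
    Sigma1 -- known population standard deviation of variate 0
    """
    n = len(r)
    p1 = Sigma1 / sigma[0]                  # greater than unity in this direction
    q1 = cmath.sqrt(1 - p1 ** 2)            # square root of a minus quantity
    q = [q1] + [q1 * r[0][i] for i in range(1, n)]
    p = [cmath.sqrt(1 - x * x) for x in q]  # real, and greater than unity
    Sigma = [(pi * si).real for pi, si in zip(p, sigma)]
    R = [[1.0 if i == j else ((r[i][j] - q[i] * q[j]) / (p[i] * p[j])).real
          for j in range(n)] for i in range(n)]
    return Sigma, R


r_sample = [
    [1.000, 0.509, 0.574, 0.204],
    [0.509, 1.000, 0.325, 0.054],
    [0.574, 0.325, 1.000, -0.113],
    [0.204, 0.054, -0.113, 1.000],
]
sigma_sample = [10.2, 20.9, 22.1, 13.7]

Sigma, R = population_from_sample(r_sample, sigma_sample, 16.5)
print([round(x, 1) for x in Sigma])
for row in R:
    print([round(x, 2) for x in row])
```

The products of two imaginary shrinkages are real and negative, which is why the numerators grow instead of shrinking when the formula is run in this direction.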

5. Variance of differences between scores.—Our numerical 
example enables us to illustrate a very useful fact, that the 
variance of the differences between the scores in two tests 
is independent of the amount of selection if both tests have 
been equally shrunk, and is reasonably constant when this 
condition is not too much departed from. 

For example, $\sigma^2$ for the differences between the scores in 
Tests 2 and 3 would be, by the formula— 

\[ \sigma_{2-3}^2 = \sigma_2^2 + \sigma_3^2 - 2 r_{23}\sigma_2\sigma_3 \]

equal in the population to— 

\[ 24.9^2 + 27.3^2 - 2 \times 24.9 \times 27.3 \times .54 = 631.15 \]

and in the sample to— 

\[ 20.9^2 + 22.1^2 - 2 \times 20.9 \times 22.1 \times .325 = 625.0 \]

that is, almost the same, although $p_2$ does not quite equal 
$p_3$. This fact gives another method of estimating a population 
correlation if the sample correlation between 
differences can be calculated, and if the standard deviations 
in the population are known or can be guessed. For 
example, suppose a worker with the sample calculated 
from his data the value— 
\[ \sigma_{2-3}^2 = 625 \]

and had reason to think that in the population, or in some 
other sample, the standard deviations were 25 and 27 (as 
they nearly are in our example), he could estimate the 
unknown correlation as— 

\[ \frac{2 \times 25 \times 27 - 625}{2 \times 25 \times 27} = .537 \]

Actually it was .54. But this method would fail badly if 
the quantities $p_2$ and $p_3$ were markedly different (Emmett, 
1951, B.J.P.Statist., 4, (1)). 

6. Selection and partial correlation.—If a sample is made 
completely homogeneous in the Stanford-Binet test, clearly 
$p_1 = 0$ and $q_1 = 1$. The same formulæ then give us : 



              1       2       3       4 
      q       1     .69     .75     .32 
      p       0     .724    .661    .947 
      σ       0    18.0    18.1    13.5 

and the resulting correlation coefficients, which in this case 
are called “coefficients of partial correlation for constant 
Stanford-Binet score,” are, by the same formula : 

        |    1       2       3       4 
      --+-------------------------------- 
      1 |    .       .       .       . 
      2 |    .       .     .047   −.060 
      3 |    .     .047      .    −.287 
      4 |    .    −.060   −.287      . 

The correlations of the Stanford-Binet test with the 
others are given by the formula as 0/0, that is, indeterminate. 
That they are really zero is seen from the fact that, if $p_1$ is 
taken as not quite zero, but very small, these correlations 
come out by the formula as very small. They vanish with $p_1$. 

In this special case of “partial correlation,” where the 
directly selected test is so stringently selected that everyone 
in the sample has exactly the same score in it, our formula— 


\[ r_{ij} = \frac{R_{ij} - q_i q_j}{p_i p_j} \]

has a more familiar form. For since— 

\[ q_i = q_1 R_{1i} \quad \text{and} \quad q_1 = 1 \]

in this case of complete shrinkage we have— 

\[ q_i = R_{1i} \quad \text{and} \quad p_i = \sqrt{1 - R_{1i}^2} \]

so that our formula becomes— 

\[ r_{ij} = \frac{R_{ij} - R_{1i}R_{1j}}{\sqrt{1 - R_{1i}^2}\,\sqrt{1 - R_{1j}^2}} \]

the usual form of a partial correlation coefficient. Its 
more conventional notation is, calling the test which is 
made constant Test k instead of Test 1— 

\[ r_{ij \cdot k} = \frac{r_{ij} - r_{ik}r_{jk}}{\sqrt{1 - r_{ik}^2}\,\sqrt{1 - r_{jk}^2}} \]

If the “test” which is held constant is the factor g, 
this becomes— 

\[ r_{ij \cdot g} = \frac{r_{ij} - r_{ig}r_{jg}}{\sqrt{1 - r_{ig}^2}\,\sqrt{1 - r_{jg}^2}} \]

which is called the “specific correlation” between i and j. 
Its numerator is the “residue” left after removing the 
correlation due to g. If g is the sole cause of correlation, 
holding g constant will destroy the correlation and we shall 
have— 

\[ r_{ij} = r_{ig}\,r_{jg} \]

as we already saw from another point of view was the case 
in a hierarchical battery, in Section 4 of Chapter I. 
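
The partial correlation formula is easily put into code. The sketch below is a hedged illustration: the function name is hypothetical, the first three calls use the population correlations of the four-test example (.69, .75, .32, .54, .18, .06) with the Stanford-Binet held constant, and the last call uses the loadings .9 and .8 of a hierarchical battery to show the specific correlation vanishing.

```python
import math


def partial_correlation(r_ij, r_ik, r_jk):
    """Correlation of i and j with variate k held constant."""
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


# Tests 2, 3, 4 of the numerical example, Stanford-Binet (Test 1) held constant.
r23_1 = partial_correlation(0.54, 0.69, 0.75)
r24_1 = partial_correlation(0.18, 0.69, 0.32)
r34_1 = partial_correlation(0.06, 0.75, 0.32)

# Hierarchical case: r12 = .9 x .8 = .72, so holding g constant leaves nothing.
specific_12 = partial_correlation(0.72, 0.90, 0.80)

print(round(r23_1, 3), round(r24_1, 3), round(r34_1, 3))
print(abs(specific_12) < 1e-12)
```

The denominators are the square roots $\sqrt{1 - R_{1i}^2}$, not the quantities $1 - R_{1i}^2$ themselves; confusing the two roughly doubles the first coefficient in this example.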

7. Effect on communalities.—The formula— 

\[ r_{ij} = \frac{R_{ij} - q_i q_j}{p_i p_j} \]

is thus a very useful formula, including partial correlation 
as a special case. If the original variances are each taken 
as unity, the numerator $R_{ij} - q_i q_j$ for $i \neq j$ gives the new 
covariances, while $p_i^2$ and $p_j^2$ are the new variances. 

It also includes as a special case the formula known as 
the Otis-Kelley formula, which is applicable when two 
variates have both shrunk to the same extent (a restriction 
not always recognized). If we put $q_i = q_j$ and therefore 
$p_i = p_j$ it becomes— 

\[ \frac{1 - R_{ij}}{1 - r_{ij}} = \frac{\sigma_i^2}{\Sigma_i^2} = \frac{\sigma_j^2}{\Sigma_j^2}, \]

the Otis-Kelley formula. 

It has a still further application (Thomson, 1938b, 456), 
for if a matrix of correlations in the wider population has 
been analysed by Thurstone's process, this same formula 
gives the new communalities (with one exception) to be 
expected in the sample, if we put $i = j$ and understand by 
$R_{ii}$ the communality in the wider population, by $r_{ii}$ the 
communality in the sample (and not a reliability coefficient, 
which is the usual meaning of this symbol). Writing the 
usual symbol $h^2$ for communality we have the formula in 
the form— 

\[ h_i^2 = \frac{H_i^2 - q_i^2}{p_i^2} \qquad (i = 2, 3, 4, \ldots) \]

The exception is the new communality of the trait or 
quality which has been directly selected, in our example 
No. 1 the Stanford-Binet scores. For the directly selected 
trait the new communality is given by— 

\[ h_1^2 = \frac{p_1^2 H_1^2}{1 - q_1^2 H_1^2} \]


(Thomson, 1938b, 455; and see also Ledermann, 19380). 
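
A short sketch may make the two communality formulæ concrete. It assumes a one-factor battery with population communalities .81, .64, .49, .36 (loadings .9, .8, .7, .6) and a directly selected first test whose variance shrinks to $p_1^2 = .36$; the function name and argument layout are hypothetical.

```python
import math


def sample_communalities(H2, R1, p1):
    """Communalities to be expected after direct selection on test 0.

    H2 -- population communalities of the tests
    R1 -- population correlation of each test with test 0 (R1[0] is unused)
    p1 -- ratio of sample to population standard deviation of test 0
    """
    q1 = math.sqrt(1 - p1 ** 2)
    result = []
    for i, (H, R) in enumerate(zip(H2, R1)):
        if i == 0:
            # The exception: the directly selected test.
            result.append(p1 ** 2 * H / (1 - q1 ** 2 * H))
        else:
            q = q1 * R                       # shrinkage of an indirectly selected test
            result.append((H - q * q) / (1 - q * q))
    return result


h2 = sample_communalities(
    H2=[0.81, 0.64, 0.49, 0.36],
    R1=[1.00, 0.72, 0.63, 0.54],
    p1=0.6,
)
print([round(x, 3) for x in h2])
```

Every communality falls, the directly selected test's least in proportion, and the square roots of the new values are the loadings of the common factor an experimenter would find in the sample.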


8. A numerical example.—We shall take, in […] the 
perfectly hierarchical example of our 
Chapter I. But to save space in the tables we shall consider 
only the first four tests. Their matrix of correlations, 
with the one common factor and the four specifics added, 
and with communalities inserted in the diagonal cells, was 
as follows : 

         |   1      2      3      4   |   g     s1     s2     s3     s4 
      ---+----------------------------+---------------------------------- 
      1  | (.81)   .72    .63    .54  |  .90    .44     .      .      . 
      2  |  .72   (.64)   .56    .48  |  .80     .     .60     .      . 
      3  |  .63    .56   (.49)   .42  |  .70     .      .     .71     . 
      4  |  .54    .48    .42   (.36) |  .60     .      .      .     .80 
      ---+----------------------------+---------------------------------- 
      g  |  .90    .80    .70    .60  | 1.00     .      .      .      . 
      s1 |  .44     .      .      .   |   .    1.00     .      .      . 
      s2 |   .     .60     .      .   |   .      .    1.00     .      . 
      s3 |   .      .     .71     .   |   .      .      .    1.00     . 
      s4 |   .      .      .     .80  |   .      .      .      .    1.00 


The bottom right-hand quadrant shows, by its zero 
entries, that the factors are all uncorrelated with one 
another, that is, orthogonal. The tests expressed as linear 


functions of the factors are— 


\[
\begin{aligned}
z_1 &= .9g + .436s_1 \\
z_2 &= .8g + .600s_2 \\
z_3 &= .7g + .714s_3 \\
z_4 &= .6g + .800s_4
\end{aligned}
\]


These equations are only another way of expressing the 
same facts as are shown in the north-east, or the south-west, 
quadrant of the matrix (where only two places of 
decimals are used for the specific loadings, to keep the 
printing regular). 

Let us now suppose that this matrix and these equations 
refer to a wide and defined population, e.g. all Massachusetts 
eleven-year-olds, and let us ask what will be the 
most likely matrix of correlations between these tests and 
factors to be found in a sample chosen by their scores in 
Test 1 so as to be more homogeneous. The variance of 
Test 1 in the wider population being taken as unity, let 
us take that in the more homogeneous select sample as 
being $p_1^2 = .36$. We then have, using $q_i = q_1 R_{1i}$ and 
treating g and the specifics just like tests, the following 
table : 

                     |   1      2      3      4   |   g     s1    s2   s3   s4 
      q              |  .80    .576   .504   .432 |  .720  .349    .    .    . 
      p              |  .60    .817   .864   .902 |  .694  .937    1    1    1 
      p² (variance)  |  .36    .668   .746   .813 |  .482  .878    1    1    1 

For the correlations and communalities, using our 
formula— 

\[ r_{ij} = \frac{R_{ij} - q_i q_j}{p_i p_j} \]

we get (again printing only two decimal places) : 

         |   1      2      3      4   |   g      s1     s2     s3     s4 
      ---+----------------------------+----------------------------------- 
      1  | (.61)   .53    .44    .36  |  .78    .28      .      .      . 
      2  |  .53   (.46)   .38    .31  |  .68   −.26    .73      .      . 
      3  |  .44    .38   (.32)   .26  |  .56   −.22      .    .83      . 
      4  |  .36    .31    .26   (.21) |  .46   −.18      .      .    .89 
      ---+----------------------------+----------------------------------- 
      g  |  .78    .68    .56    .46  | 1.00   −.39      .      .      . 
      s1 |  .28   −.26   −.22   −.18  | −.39   1.00      .      .      . 
      s2 |   .     .73     .      .   |   .      .    1.00      .      . 
      s3 |   .      .     .83     .   |   .      .      .    1.00      . 
      s4 |   .      .      .     .89  |   .      .      .      .    1.00 


[…] are with the tests ; and on examination of the matrix we 
see that these, wh[…] still give the rest of the matrix. Thus— 

      .78 × .46 = .36 ($r_{14}$) 
      .68² = .46 ($h_2^2$) 

[…] of rank 1 (Thomson, 1938b, […]) […] become the 
diminished […] required by rank 1. 

[…] and with g, that is, g, $s_2$, $s_3$ and $s_4$ are still 
orthogonal. 

But something has happened to the specific $s_1$. It has 
become correlated with g, and with all the tests. It has 
become an oblique factor, orthogonal still to the other 
specifics, but inclined to g and the tests. It leans further 
away from Test 1 than it formerly did, and makes obtuse 
angles (negative correlation) with the other tests and with g 
to which it was originally orthogonal. 

But since, as we have already pointed out, the test matrix 
with the reduced communalities is still of rank 1, it is 
clear that a fresh analysis could be made of the tests into 
one common factor and specifics, thus— 

\[
\begin{aligned}
z_1 &= .778g' + .628s_1' \\
z_2 &= .679g' + .734s_2 \\
z_3 &= .562g' + .827s_3 \\
z_4 &= .462g' + .887s_4
\end{aligned}
\]

In these equations the factors $g'$, $s_1'$, $s_2$, $s_3$, and $s_4$ are 
again orthogonal (uncorrelated), and the loadings shown 
give the correlations and give unit variances. This is the 
analysis which an experimenter would make who began 
with the sample and knew nothing about any test measurements 
in the whole population. 
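
The whole of this example can be recomputed in a few lines. The sketch below is an illustration under assumptions: it builds the population matrix of the four tests, g and the four specifics from the loadings .9, .8, .7, .6, treats factors exactly like tests, and applies the selection formula with $p_1 = .6$. Function names are hypothetical, and the diagonal carries unit variances, not communalities.

```python
import math

g_loadings = [0.9, 0.8, 0.7, 0.6]


def population_matrix(loadings):
    """Correlations among the tests, g, and one specific per test."""
    n = len(loadings)
    size = 2 * n + 1                       # tests, then g, then specifics
    R = [[0.0] * size for _ in range(size)]
    for i in range(size):
        R[i][i] = 1.0
    for i, a in enumerate(loadings):
        R[i][n] = R[n][i] = a              # test with g
        s = math.sqrt(1 - a * a)
        R[i][n + 1 + i] = R[n + 1 + i][i] = s   # test with its own specific
        for j, b in enumerate(loadings):
            if i != j:
                R[i][j] = a * b            # hierarchical test correlations
    return R


def select_on_first(R, p1):
    """Expected correlations when variate 0 is directly selected."""
    q1 = math.sqrt(1 - p1 ** 2)
    q = [q1] + [q1 * R[0][i] for i in range(1, len(R))]
    p = [math.sqrt(1 - x * x) for x in q]
    return [[1.0 if i == j else (R[i][j] - q[i] * q[j]) / (p[i] * p[j])
             for j in range(len(R))] for i in range(len(R))]


r = select_on_first(population_matrix(g_loadings), 0.6)
G, S1 = 4, 5                               # positions of g and s1
g_corr = [r[i][G] for i in range(4)]       # sample correlations of g with the tests
print([round(x, 3) for x in g_corr])
print(round(r[G][S1], 3))                  # g and s1 have become oblique
print(round(r[1][3], 4), round(g_corr[1] * g_corr[3], 4))
```

The last line prints the same number twice: products of the sample correlations with g reproduce the sample test correlations, which is the rank-1 property relied on in the text.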

The reader, comparing the loadings in these equations 
with the correlations in the matrix of the sample, will 
rightly conclude that the specifics from $s_2$ onward have not 
changed. In the matrix it is clear that they are still 
orthogonal, and their correlations with the tests, in the 
matrix, are the same as their loadings in the equations. 
The tests are, in the sample, more heavily loaded with these 
specifics than they were in the population, but the specifics 
are the same in themselves. 

The new specific $s_1'$ the reader will readily agree to be 
different from $s_1$. The latter became oblique in the 
sample, whereas $s_1'$ is orthogonal. What now is to be said 
about the common factors $g$ (in the population) and $g'$ (in 
the sample)? From the fact that the loadings of $g'$, in the 
sample equations, are identical with the correlations in 
the sample matrix of the original $g$ with the tests, one is 
tempted to imagine $g'$ and $g$ to be identical in nature. But 
that is not so certain. 

If we go back to the equations of the tests in the popu- 
lation, we can rewrite them in the following form— 


\[
\begin{aligned}
z_1 &= .467g' + .800g'' + .377s_1' \\
z_2 &= .555g' + .576g'' + .600s_2 \\
z_3 &= .485g' + .504g'' + .714s_3 \\
z_4 &= .417g' + .432g'' + .800s_4
\end{aligned}
\]

with two common factors $g'$ and $g''$ instead of one common 
factor $g$. These equations still give the same correlations. 
For example— 

\[ r_{14} = .467 \times .417 + .800 \times .432 = .540 \text{ as before.} \]

In these equations the specifics $s_2$, $s_3$, $s_4$ are the same, and 
the communalities of Tests 2, 3, and 4 are the same. All 
that we have done in these three tests is to divide the 
common factor $g$ into two components. The ratio of the 
loading of $g''$ to the loading of $g'$ is the same in each of 
them. The loadings of $g''$ we have made identical with the 
shrinkages $q$ in the table on page 285. 

In Test 1 also we have made the loading of $g''$ equal to 
the shrinkage $q_1 = .8$. But in this test $g''$ cannot be looked 
upon merely as a component of $g$. To give the correct 
correlations, the loading of $g'$ has to be .467 as shown, and 
the communality of Test 1 has been raised from its former 
value (.81) to— 

\[ .467^2 + .800^2 = .858 \]

[…] of the specific has correspondingly su[…] 
are a totally new analysis o[…] 
Part of the former specific has […] 
common factors. 


[…] are— 

                                          Variances 
      $x_1 = .467g' + .377s_1'$             .360 
      $x_2 = .555g' + .600s_2$              .668 
      $x_3 = .485g' + .714s_3$              .746 
      $x_4 = .417g' + .800s_4$              .813 

The reduced variances are the sum of the squares of the 
surviving loadings, e.g.— 

\[ .467^2 + .377^2 = .360 \]

The variances, it will be seen, are the $p^2$'s of our tests 
as measured in the sample. If each of the last set of 
equations is divided through by the square root of its 
variance, we arrive at the equations— 

\[
\begin{aligned}
z_1 &= .778g' + .628s_1' \\
z_2 &= .679g' + .734s_2 \\
z_3 &= .562g' + .827s_3 \\
z_4 &= .462g' + .887s_4
\end{aligned}
\]

which is the analysis already given as that of an experimenter 
who knew only the sample. As to the nature of $g'$, 
we can say in Tests 2, 3, and 4 that it is possible to regard 
it as a component of the $g$ of the population. But we 
cannot do so with assurance in Test 1. There its nature is 
more dubious. At all events, it is not the same common 
factor as in the population, and at best we can say that it 
is one of its components. 

9. A sample all alike in Test 1.—These phenomena are 
still more striking if we consider a case where the sample 
is composed of persons who are all alike in Test 1. It 
would be an excellent exercise for the reader to calculate 
the resulting matrix of correlations for tests and population 
factors in this case. The tests act in this case as though 
their original equations in the population had been— 

\[
\begin{aligned}
z_1 &= g'' \\
z_2 &= .349g' + .720g'' + .600s_2 \\
z_3 &= .305g' + .630g'' + .714s_3 \\
z_4 &= .262g' + .540g'' + .800s_4
\end{aligned}
\]

and then $g''$ had become zero, i.e. a constant with no 
variance. 

[…] constant does not hold its factors 
$g$ and $s_1$ constant. They can vary in the sample from 
man to man, but since— 

\[ z_1 = .9g + .436s_1 \]

remains constant, a man in the sample who has a high $g$ 
must have a low $s_1$, that is, these factors are negatively 
correlated in the sample. And because they are thus 
negatively correlated, those members of the sample who 
have high $g$'s, and who will therefore tend to do well in 
Tests 2, 3, and 4, will tend to have values below average 
(negative values) for their $s_1$, which will be therefore 
negatively correlated with these tests, in this sample. 
So far in our examples we have assumed the sample to 
be more homogeneous than the population. But a sample 
can be selected to be less homogeneous. In such a case 
the same formulæ will serve, if we simply make the capital 
letters refer to the sample and the small to the population. 
In fact, the same tables, with their rôles reversed, can 
illustrate this case. In practical life we usually know which 
of two groups we would call the sample, and which the 
population. But mathematically there is no distinction, 
the one is a distortion of the other, and which is the “true” 
state of affairs is a question without meaning. 

It must also throughout be remembered that all these 
formulæ and statements refer, not to consequences which 
are certain to follow, but to consequences which are to be 
expected. If actual samples were made the values experimentally 
found in them for correlations, communalities, 
loadings, etc., would oscillate about those given by our 
formulæ, violently in the case of small samples, only 
slightly in the case of large samples. 

10. An example of rank 2.—The above example has only 
one common factor. We turn next to consider an example 
with two. Again it is, we suppose, the first test according 
to which the sample is deliberately selected, and again 
we suppose the “shrinkage” $q_1$ to be .8. The matrices 
of correlations and communalities, in the population and 
in the sample, are then as follows, the two factors $f_1$ and $f_2$ 
and the specifics being treated in the calculation exactly 
as if they were tests. To economize room on the page 
we omit the later specifics. 

Correlations in the Population 

         |   1      2      3      4      5   |   f1      f2      s1      s2 
      ---+-----------------------------------+-------------------------------- 
      1  | (.65)   .46    .59    .36    .41  |  .70     .40     .59      . 
      2  |  .46   (.37)   .36    .26    .23  |  .60     .10      .      .79 
      3  |  .59    .36   (.61)   .32    .45  |  .50     .60      .       . 
      4  |  .36    .26    .32   (.20)   .22  |  .40     .20      .       . 
      5  |  .41    .23    .45    .22   (.34) |  .30     .50      .       . 
      ---+-----------------------------------+-------------------------------- 
      f1 |  .70    .60    .50    .40    .30  | (1.00)    .       .       . 
      f2 |  .40    .10    .60    .20    .50  |   .     (1.00)    .       . 
      s1 |  .59     .      .      .      .   |   .       .     (1.00)    . 
      s2 |   .     .79     .      .      .   |   .       .       .     (1.00) 

Correlations in the Sample 

         |   1      2      3      4      5   |   f1      f2      s1      s2 
      ---+-----------------------------------+-------------------------------- 
      1  | (.40)   .30    .40    .23    .26  |  .51     .25     .40      . 
      2  |  .30   (.27)   .23    .17    .12  |  .51    −.02    −.21     .85 
      3  |  .40    .23   (.50)   .22    .35  |  .32     .54    −.29      . 
      4  |  .23    .17    .22   (.13)   .14  |  .30     .12    −.16      . 
      5  |  .26    .12    .35    .14   (.26) |  .15     .44    −.19      . 
      ---+-----------------------------------+-------------------------------- 
      f1 |  .51    .51    .32    .30    .15  | (1.00)  −.23    −.36      . 
      f2 |  .25   −.02    .54    .12    .44  |  −.23  (1.00)   −.18      . 
      s1 |  .40   −.21   −.29   −.16   −.19  |  −.36   −.18   (1.00)     . 
      s2 |   .     .85     .      .      .   |   .       .       .     (1.00) 


We see here a new phenomenon. The two common 
factors $f_1$ and $f_2$ in the population were orthogonal to one 
another, as is shown by the zero correlation between them. 
But in the sample they are negatively correlated (−.228) ; 
that is, they are oblique. We begin to see a generalization 
which can be algebraically proved, that all the factors, 
common and specific, which are concerned with the directly 
selected test(s) become oblique to each other and to all the tests, 
but the specifics of the indirectly selected tests remain orthogonal 
to everything, except each to its own test. 

But the matrix of the tests themselves is still of rank 2, 
and an experimenter working only with the sample would 
find this out, although he would know nothing about the 
population matrix. He would therefore set to work to 
analyse it into two common factors, orthogonal to one 
another. A Thurstone analysis comes out in two common 
factors exactly, and can be rotated until all the loadings 
are positive. For example : 

      Test          |    1       2       3       4       5 
      Factor $f_1'$ |  .570    .521    .436    .332    .238 
      Factor $f_2'$ |  .276      .     .555    .130    .452 

These factors $f'$, however, are clearly a different pair 
from the factors $f$ in the original population. In the 
sample, those original factors ($f$) are oblique; these ($f'$) 
are orthogonal. 
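
The obliquity of the two population factors in the sample can be verified numerically. The sketch below is a hedged illustration that assumes the population loadings are .7, .6, .5, .4, .3 on the first factor and .4, .1, .6, .2, .5 on the second, with the first test directly selected and $q_1 = .8$ (so $p_1 = .6$). Function names are hypothetical; specifics are left out because they do not affect the quantities printed.

```python
import math

loadings = [(0.7, 0.4), (0.6, 0.1), (0.5, 0.6), (0.4, 0.2), (0.3, 0.5)]


def population_matrix(loadings):
    """Correlations among the tests and the two orthogonal common factors."""
    n = len(loadings)
    size = n + 2                                   # tests, then f1, then f2
    R = [[0.0] * size for _ in range(size)]
    for i in range(size):
        R[i][i] = 1.0
    for i, (a, b) in enumerate(loadings):
        R[i][n] = R[n][i] = a                      # test with f1
        R[i][n + 1] = R[n + 1][i] = b              # test with f2
        for j, (c, d) in enumerate(loadings):
            if i != j:
                R[i][j] = a * c + b * d            # rank-2 test correlations
    return R


def select_on_first(R, p1):
    """Expected correlations when variate 0 is directly selected."""
    q1 = math.sqrt(1 - p1 ** 2)
    q = [q1] + [q1 * R[0][i] for i in range(1, len(R))]
    p = [math.sqrt(1 - x * x) for x in q]
    r = [[1.0 if i == j else (R[i][j] - q[i] * q[j]) / (p[i] * p[j])
          for j in range(len(R))] for i in range(len(R))]
    return q, p, r


q, p, r = select_on_first(population_matrix(loadings), 0.6)
F1, F2 = 5, 6
print(round(r[F1][F2], 3))          # correlation of f1 with f2 in the sample
print(round(r[0][2], 2), round(r[2][4], 2))

# New communality of an indirectly selected test, here Test 3.
H3 = 0.5 ** 2 + 0.6 ** 2
print(round((H3 - q[2] ** 2) / p[2] ** 2, 2))
```

The negative correlation arises because both factors contribute positively to the selected test: among persons held to a narrow band of scores in Test 1, a high amount of one factor must be offset by a low amount of the other.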

Again the whole phenomenon is reversible. The second 
matrix (with the orthogonal factors $f'$) might refer to the 
population, and a sample picked with a suitable increased 
scatter of Variate 1. All our formulæ could be worked 
backwards, and we should arrive at the matrix beginning 
(.65), referring now to the sample. The $f'$ factors would 
have become oblique, and a new analysis, suitably rotated, 
would give us the other factors $f$. 

It becomes evident that the orthogonal factors we obtain 
by the analysis of tests depend upon the subpopulation we 
have tested. They are not realities in any physical sense 
of the word ; they vary and change as we pass from one 
body of men to another. It is possible, and this is a hope 
hinted at in Thurstone’s book The Vectors of Mind, that if 
we could somehow identify a set of factors throughout all 
their changes from sample to sample (in most of which 
they would be oblique) as being in some way unique, we 
might arrive at factors having some measure of reality 
and fixity. Thurstone, in his latest book Multiple Factor 
Analysis, believes that he has achieved this, and that his 
oblique Simple Structure is invariant. His claim is considered 
in our next chapter. It is, in the present writer’s 
opinion, justifiable only for univariate selection, not for 
multivariate, which is not merely repeated univariate 
selection. 


11. Random selection.—These […] the resul[…] 
test is changed to some desired extent, […] 
The new variances and the changed 
correlations of the other tests given by our formula— […] 

If we selected a large […] 
all with the same red[…] 
not all be alike in the re[…] On the contrary 
[…] But most of them would 
be like the expected set, few would depart widely from that ; 
and the departures would be in both directions, some 
[…], others on the other side, 
of our expectation. 

If now, instead of selecting samples which are all alike 
in the variance of one nominated test, we take a large 
number of random samples of the same size, what would we 
find? Among them would be a number which were alike 
in the variance of Test 1, and these in the other part of 
the correlation matrix would have values which varied 
round about those given by our formula. We could also 
pick out, instead of a set all alike in the variance of Test 1, 
a different set all alike in the variance of Test 4, say ; 
and these would have values in the remainder of the matrix 
oscillating about our formula, in which Test 4 would replace 
Test 1. In short, a complex family of random samples 
would show a structure among themselves such that if we 
fix any one variance the average of that array of samples 
obeys our formula.* Random sampling will not merely 
add an “ error specific " to existing factors, it will make 
complex changes in the common factors. 


* On the author's suggestion, Dr. W. Ledermann has since 
proved this conjecture analytically (Biometrika, 1939a, 30, 295- 
304). His results cover also the case of multivariate selection (see 
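next chapter). 

[…] standardized, the variances in this whole R matrix are all 
unity, and the covariances are simply coefficients of 
correlation. In our case the R matrix is : 

Analysis in the Population 

         |   1     2  |   3     4     5     6  |   g     s1    s2    s3    s4    s5    s6 
      ---+------------+------------------------+------------------------------------------ 
      1  | 1.00   .72 |  .63   .54   .45   .36 |  .90   .44    .     .     .     .     . 
      2  |  .72  1.00 |  .56   .48   .40   .32 |  .80    .    .60    .     .     .     . 
      ---+------------+------------------------+------------------------------------------ 
      3  |  .63   .56 | 1.00   .42   .35   .28 |  .70    .     .    .71    .     .     . 
      4  |  .54   .48 |  .42  1.00   .30   .24 |  .60    .     .     .    .80    .     . 
      5  |  .45   .40 |  .35   .30  1.00   .20 |  .50    .     .     .     .    .87    . 
      6  |  .36   .32 |  .28   .24   .20  1.00 |  .40    .     .     .     .     .    .92 
      g  |  .90   .80 |  .70   .60   .50   .40 | 1.00    .     .     .     .     .     . 
      s1 |  .44    .  |   .     .     .     .  |   .   1.00    .     .     .     .     . 
      s2 |   .    .60 |   .     .     .     .  |   .     .   1.00    .     .     .     . 
      s3 |   .     .  |  .71    .     .     .  |   .     .     .   1.00    .     .     . 
      s4 |   .     .  |   .    .80    .     .  |   .     .     .     .   1.00    .     . 
      s5 |   .     .  |   .     .    .87    .  |   .     .     .     .     .   1.00    . 
      s6 |   .     .  |   .     .     .    .92 |   .     .     .     .     .     .   1.00 

The $R_{pp}$ matrix is the square 2 × 2 matrix, the $R_{qq}$ matrix 
the square 11 × 11 matrix, while $R_{pq}$ has two rows and 
[…] the same transposed. 

[…] what may be expected to happen 
when $R_{pp}$ is changed to $V_{pp}$, 
[…] were first found by Karl Pearson, 
[…] matrix form in which we are about 
[…] Aitken (Aitken, 1934). The matrix 
changes to : 

\[
\begin{pmatrix} R_{pp} & R_{pq} \\ R_{qp} & R_{qq} \end{pmatrix}
\longrightarrow
\begin{pmatrix}
V_{pp} & V_{pp}R_{pp}^{-1}R_{pq} \\
R_{qp}R_{pp}^{-1}V_{pp} & R_{qq} - R_{qp}\left(R_{pp}^{-1} - R_{pp}^{-1}V_{pp}R_{pp}^{-1}\right)R_{pq}
\end{pmatrix}
\]

[…] recommendation to 
[…] the whole calculation systematically. 
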
For the first four tests we have: 

$$
R_{pq} = \begin{bmatrix} .63 & .54 \\ .56 & .48 \end{bmatrix},
\qquad
R_{qp} = \begin{bmatrix} .63 & .56 \\ .54 & .48 \end{bmatrix}
$$
The most tiresome part of the calculation, if the number 
of directly selected tests is large, is to find $R_{pp}^{-1}$, the 
reciprocal of the matrix $R_{pp}$, such that the product: 

$$ R_{pp} R_{pp}^{-1} = I $$

where I is the so-called “unit matrix” which has unit 
entries in the diagonal and zero entries everywhere else. 
The method of doing this is given in Chapter XIV, 
Section 9, page 210. In the present example, where $R_{pp}$ 
is only of dimensions 2 x 2, we soon find: 

$$ R_{pp}^{-1} = \begin{bmatrix} 2.0764 & -1.4950 \\ -1.4950 & 2.0764 \end{bmatrix} $$

When the reciprocal matrix $R_{pp}^{-1}$ has thus been calculated, 
the best way of proceeding is to find: 

$$ C = R_{pp}^{-1} R_{pq} \qquad \text{and} \qquad D = R_{qq} - R_{qp} C $$

In the case of our example these are: 

$$
C = \begin{bmatrix} 2.0764 & -1.4950 \\ -1.4950 & 2.0764 \end{bmatrix}
\begin{bmatrix} .63 & .54 \\ .56 & .48 \end{bmatrix}
= \begin{bmatrix} .4709 & .4037 \\ .2209 & .1894 \end{bmatrix}
$$

$$
D = \begin{bmatrix} 1.00 & .42 \\ .42 & 1.00 \end{bmatrix}
- \begin{bmatrix} .63 & .56 \\ .54 & .48 \end{bmatrix}
\begin{bmatrix} .4709 & .4037 \\ .2209 & .1894 \end{bmatrix}
= \begin{bmatrix} 1.00 & .42 \\ .42 & 1.00 \end{bmatrix}
- \begin{bmatrix} .4204 & .3604 \\ .3604 & .3089 \end{bmatrix}
= \begin{bmatrix} .5796 & .0596 \\ .0596 & .6911 \end{bmatrix}
$$

subtraction of matrices being carried out by subtracting 
each element from the corresponding one. We next need: 

$$
V_{pp} C = \begin{bmatrix} .36 & .30 \\ .30 & .36 \end{bmatrix}
\begin{bmatrix} .4709 & .4037 \\ .2209 & .1894 \end{bmatrix}
= \begin{bmatrix} .2358 & .2022 \\ .2208 & .1893 \end{bmatrix}
$$

which gives us the new covariances of the directly selected 
tests with those indirectly selected. For $V_{qq}$ we need still 
$C'(V_{pp} C)$, where the prime indicates that the matrix is 
transposed (rows becoming columns): 




CHAPTER XIX 


THE INFLUENCE OF MULTIVARIATE 
SELECTION * 


1. Altering two variances and the covariance.—In the pre- 
ceding chapter we have discussed the changes which occur 
in the variances and correlations of a set of tests, and in 
their factors, when the sample of persons tested is chosen 
according to their performance in one of the tests: we 
are next going to see the results of picking our sample by 
their performances in more than one of the tests, first of 
all in two of them. Take again, the perfectly hierarchical 
example of the last chapter. We must this time go as far 
as six tests in order to see all the consequences. The matrix 
of correlations of these tests and their factors will be 
simply an extension of that printed on page 285. 


Now let us imagine a sample picked so that the variance 
of Test 1 and also that of Test 2 is intentionally altered, 
and further, their covariance (and hence their correlation) 
changed to some predetermined value. 

It is at once clear that in these two directly selected 
tests the factorial composition will in general be changed 
—can indeed be changed to anything which is not incom- 
patible with common sense and the laws of logic. What, 
however, will be the resulting sympathetic changes in the 
variances and covariances of the other tests of the battery ? 

In Chapter XVIII we altered the variance of Test 1 from 
unity to .36. The consequent diminution in variance to be 
expected in Test 2 was, as is shown on page 285, from 
unity to .668, and the consequent change in correlation 
from .72 to .53. Here, however, let us pick our sample so 
that the variance of the second test is also diminished to 
.36, and so that the correlation between them, instead of 
falling, rises to .833. We have, that is to say, chosen 
people for our sample who tend to be rather more alike 

* Thomson, 1937 ; Thomson and Ledermann, 1938. 
than usual in these two test scores, as well as being closely 
grouped in each, an unusual but not an inconceivable 
sample. Natural selection (which includes selection by the 
other sex in mating) has no doubt often preferred 
individuals in whom two organs tended to go together, as 
long legs with long arms, and the same sort of thing might 
occur in mental traits. In terms of variance and covariance 
we have changed the matrix : 

$$ R_{pp} = \begin{bmatrix} 1.00 & .72 \\ .72 & 1.00 \end{bmatrix} $$

to the matrix : 

$$ V_{pp} = \begin{bmatrix} .36 & .30 \\ .30 & .36 \end{bmatrix} $$

for $\frac{.30}{\sqrt{.36 \times .36}} = \frac{5}{6} = .833$, the new correlation. Notice 
that the diagonal entries here (unities in $R_{pp}$ and .36, .36 
in $V_{pp}$) are the variances, not the communalities. 
2. Aitken's multivariate selection formula.—We shall 
symbolically represent the whole original matrix of 
variances and covariances by : 

$$ \begin{bmatrix} R_{pp} & R_{pq} \\ R_{qp} & R_{qq} \end{bmatrix} $$

where the subscript p refers to the directly selected or 
picked tests, and the subscript q to all the other tests and 
the factors. $R_{pq}$ (and also $R_{qp}$) means the matrix of 
covariances of the picked tests with all the others, including 
the factors. $R_{qq}$ means the matrix of variances and 
covariances of the latter among themselves. Since at the 
outset the tests and factors are all assumed to be stan- 


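$$
C'(V_{pp} C) = \begin{bmatrix} .4709 & .2209 \\ .4037 & .1894 \end{bmatrix}
\begin{bmatrix} .2358 & .2022 \\ .2208 & .1893 \end{bmatrix}
= \begin{bmatrix} .1598 & .1370 \\ .1370 & .1175 \end{bmatrix}
$$

and then: 

$$
V_{qq} = D + C'(V_{pp} C)
= \begin{bmatrix} .5796 & .0596 \\ .0596 & .6911 \end{bmatrix}
+ \begin{bmatrix} .1598 & .1370 \\ .1370 & .1175 \end{bmatrix}
= \begin{bmatrix} .7394 & .1966 \\ .1966 & .8086 \end{bmatrix}
$$

We now can write down the whole new 4 x 4 matrix 
of variances and covariances: 

$$
\begin{bmatrix}
.36 & .30 & .2358 & .2022 \\
.30 & .36 & .2208 & .1893 \\
.2358 & .2208 & .7394 & .1966 \\
.2022 & .1893 & .1966 & .8086
\end{bmatrix}
$$

In the same way, had we included the other tests and the 
factors, we would have arrived at the whole new 13 x 13 
matrix for all the variances and covariances which we now 
print. The values calculated above for the first four tests 
will be recognized in its top left-hand corner. (The diagonal 
entries are variances, not communalities.) 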
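The four-test calculation can be checked step by step. The sketch below is only an illustration of the arithmetic described in the text (find the reciprocal of $R_{pp}$, then C, D, $V_{pp}C$ and finally $V_{qq}$); the helper functions are assumptions of this sketch, written for 2 x 2 blocks only, and are not part of the original.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inverse_2x2(m):
    """Reciprocal of a 2 x 2 matrix (enough for two picked tests)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

r_pp = [[1.00, 0.72], [0.72, 1.00]]   # picked tests 1 and 2, population
r_pq = [[0.63, 0.54], [0.56, 0.48]]   # picked tests with tests 3 and 4
r_qq = [[1.00, 0.42], [0.42, 1.00]]   # tests 3 and 4, population
v_pp = [[0.36, 0.30], [0.30, 0.36]]   # picked tests after selection

c = matmul(inverse_2x2(r_pp), r_pq)             # C = Rpp^-1 Rpq
d = subtract(r_qq, matmul(transpose(r_pq), c))  # D = Rqq - Rqp C
v_pq = matmul(v_pp, c)                          # new covariances, picked with others
v_qq = add(d, matmul(transpose(c), v_pq))       # Vqq = D + C'(Vpp C)

for row in v_qq:
    print([round(x, 4) for x in row])
```

Carried to more decimals than the hand calculation, some entries may differ from the printed figures by a unit in the fourth decimal place.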


Covariances in the Sample 


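    |     1     2 |     3     4     5     6     g    s1    s2    s3    s4    s5    s6
 1  |   .36   .30 |   .24   .20   .17   .13   .34   .13   .05     .     .     .     .
 2  |   .30   .36 |   .22   .19   .16   .13   .32   .04   .18     .     .     .     .
 3  |   .24   .22 |   .74   .20   .16   .13   .33  −.14  −.07   .71     .     .     .
 4  |   .20   .19 |   .20   .81   .14   .11   .28  −.12  −.06     .   .80     .     .
 5  |   .17   .16 |   .16   .14   .87   .09   .23  −.10  −.05     .     .   .87     .
 6  |   .13   .13 |   .13   .11   .09   .91   .19  −.08  −.04     .     .     .   .92
 g  |   .34   .32 |   .33   .28   .23   .19   .47  −.19  −.10     .     .     .     .
 s1 |   .13   .04 |  −.14  −.12  −.10  −.08  −.19   .70   .32     .     .     .     .
 s2 |   .05   .18 |  −.07  −.06  −.05  −.04  −.10   .32   .43     .     .     .     .
 s3 |     .     . |   .71     .     .     .     .     .     .  1.00     .     .     .
 s4 |     .     . |     .   .80     .     .     .     .     .     .  1.00     .     .
 s5 |     .     . |     .     .   .87     .     .     .     .     .     .  1.00     .
 s6 |     .     . |     .     .     .   .92     .     .     .     .     .     .  1.00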
3. Features 
its own test), are still of unit variance, and have still the 
same covariances with their own tests, though these will 
become larger correlations when the tests are restan- 
dardized ; 

(2) The specifics of the directly selected tests have 
become oblique common factors, correlated with everything 
except the other specifics ; 

(3) The matrix of the indirectly selected tests is still of 
the same rank (here rank 1) ; 

(4) The variances of the factors g, $s_1$, and $s_2$ have been 
reduced to .47, .70, and .43. 
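The reduced variances in (4) can be reproduced from the selection formula applied to one factor at a time. This is a sketch under the assumption that the new variance of a standardized variable q is $1 - c'(R_{pp} - V_{pp})c$, where $c = R_{pp}^{-1} r$ and r holds the population covariances of q with the two picked tests; that expression is the diagonal of $D + C'(V_{pp}C)$ written for a single column. The names are illustrative.

```python
from math import sqrt

r_pp = [[1.00, 0.72], [0.72, 1.00]]
v_pp = [[0.36, 0.30], [0.30, 0.36]]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

r_inv = inverse_2x2(r_pp)

def variance_after_selection(r_q):
    """New variance of a unit-variance variable with covariances r_q to Tests 1 and 2."""
    c = matvec(r_inv, r_q)
    return 1.0 - dot(c, matvec(r_pp, c)) + dot(c, matvec(v_pp, c))

# Population covariances with Tests 1 and 2 (loadings .9 and .8 on g,
# specific loadings sqrt(1 - .81) and sqrt(1 - .64)).
population = {"g": [0.9, 0.8], "s1": [sqrt(0.19), 0.0], "s2": [0.0, 0.6]}
new_variances = {name: variance_after_selection(r) for name, r in population.items()}

for name, value in new_variances.items():
    print(name, round(value, 2))
```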

An experimenter beginning with this sample, and 
knowing nothing about the factors in the wider population, 
would have no means of knowing these relative variances, 
and would no doubt standardize all his tests. He certainly 
would not think of using factors with other than unit 
variance. And even if he were by a miracle to arrive at 
an analysis corresponding to the last table, with three 
oblique general factors, he would reject it (a) because of 
the negative correlations of some of the factors, and 
(b) because he can reach an analysis with only two common 
factors, and those orthogonal. It is therefore practically 
certain that he will not reach the population factors, at 
least as far as the directly selected tests are concerned. 
His data and his analysis will be as overleaf. The variances 
are all made unity and the covariances converted into 
correlations. The analysis into factors is a new one, not 
derived from the last table. 
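Restandardizing is a simple rescaling: each covariance is divided by the product of the two standard deviations. A minimal sketch, using the 4 x 4 block of sample covariances calculated earlier for the first four tests (the function name is an assumption of this sketch):

```python
from math import sqrt

def covariances_to_correlations(v):
    """Divide each covariance by the two standard deviations concerned."""
    sd = [sqrt(v[i][i]) for i in range(len(v))]
    return [[v[i][j] / (sd[i] * sd[j]) for j in range(len(v))]
            for i in range(len(v))]

sample_covariances = [
    [0.36,   0.30,   0.2358, 0.2022],
    [0.30,   0.36,   0.2208, 0.1893],
    [0.2358, 0.2208, 0.7394, 0.1966],
    [0.2022, 0.1893, 0.1966, 0.8086],
]

sample_correlations = covariances_to_correlations(sample_covariances)
for row in sample_correlations:
    print([round(x, 2) for x in row])
```

A computed entry can differ from a printed one by a unit in the second decimal place, depending on the stage at which the rounding is done.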

4. Appearance of a new factor —The most noticeable 
change in this sample analysis, as compared with the 
population analysis on page 296, is the appearance of a 
new “ factor ” h linking the directly selected tests, a factor 
which is clearly due entirely to that selection. What 
degree of reality ought to be attributed to it? Does it 
differ from the other factors really, or have they also been 
produced by selection, even in the population, which is 
only in its turn a sample chosen by natural selection from 
past generations ? 

Otherwise the analysis is still into one common factor 
and specifics. The loadings of the common factor are 
    |     1     2 |     3     4     5     6    g'     h    s1    s2    s3    s4    s5    s6
 1  |  1.00   .83 |   .46   .38   .30   .24   .82   .45   .35     .     .     .     .     .
 2  |   .83  1.00 |   .43   .35   .28   .22   .77   .45     .   .46     .     .     .     .
 3  |   .46   .43 |  1.00   .26   .21   .16   .56     .     .     .   .83     .     .     .
 4  |   .38   .35 |   .26  1.00   .17   .13   .46     .     .     .     .   .89     .     .
 5  |   .30   .28 |   .21   .17  1.00   .11   .37     .     .     .     .     .   .93     .
 6  |   .24   .22 |   .16   .13   .11  1.00   .29     .     .     .     .     .     .   .96
 g' |   .82   .77 |   .56   .46   .37   .29  1.00     .     .     .     .     .     .     .
 h  |   .45   .45 |     .     .     .     .     .  1.00     .     .     .     .     .     .
 s1 |   .35     . |     .     .     .     .     .     .  1.00     .     .     .     .     .
 s2 |     .   .46 |     .     .     .     .     .     .     .  1.00     .     .     .     .
 s3 |     .     . |   .83     .     .     .     .     .     .     .  1.00     .     .     .
 s4 |     .     . |     .   .89     .     .     .     .     .     .     .  1.00     .     .
 s5 |     .     . |     .     .   .93     .     .     .     .     .     .     .  1.00     .
 s6 |     .     . |     .     .     .   .96     .     .     .     .     .     .     .  1.00


less than they were in the population, and this, as our table 
of variances and covariances shows, is due to a real 
diminution in the variance of the common factor. The 
new common factor g' is a component of the old one. 

The loadings of $s_1$ and $s_2$ have also sunk, because they 
have been in part turned into a new common factor. The 
loadings of the other specifics have risen. But this is 
entirely because the variance of the tests has sunk due to 
the shrinkage in g, and is not due to any new specifics 
being added. 

All these considerations [...] whether any factors [...] 
also a given population of persons. 


Professor Thurstone, however, in his new book Multiple 
Factor Analysis (1947) gives what he mildly calls “a less 
pessimistic interpretation than Godfrey Thomson's of the 
factorial results of selection.” 

5. Identity of simple structure factors after univariate 
selection.—In that book, Thurstone discusses in Chapter 
XIX the effects of selection, and shows by examples that 
if a battery of tests yields simple structure with oblique 
factors (including, of course, the orthogonal case), then 
after univariate selection the same factors (though at new 
angles with one another) are identified by the new structure, 
which is still simple. 

If, for example, the battery which gives the correlations 
on our page 152, and yields Figure 26 on page 158, has the 
standard deviation of Test 2 reduced to one-half, then by 
the methods described on our pages 296-8 we can calculate 
that the matrix of correlations and communalities becomes : 


    |     1       2       3       4       5       6
 1  |   .589    .295   −.044   −.140    .366    .000
 2  |   .295    .302    .049    .159    .183    .000
 3  |  −.044    .049    .555    .115    .304    .506
 4  |  −.140    .159    .115    .371   −.087    .000
 5  |   .366    .183    .304   −.087    .439    .322
 6  |   .000    .000    .506    .000    .322    .493


The rank of this matrix is still 3 as it was before selection, 
and three centroid factors are found to have loadings: 


         I       II      III
 1     .409    .647    .058
 2     .379    .244   −.315
 3     .569   −.444    .184
 4     .160   −.271   −.522
 5     .585    .174    .257
 6     .506   −.350    .337
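That three factors suffice can be checked by multiplying the loadings back together: the sum of products of two rows of loadings should return the corresponding correlation, and the sum of squares of a row its communality. The sketch below uses the figures as restored above; the variable names are illustrative only.

```python
correlations = [
    [ 0.589,  0.295, -0.044, -0.140,  0.366, 0.000],
    [ 0.295,  0.302,  0.049,  0.159,  0.183, 0.000],
    [-0.044,  0.049,  0.555,  0.115,  0.304, 0.506],
    [-0.140,  0.159,  0.115,  0.371, -0.087, 0.000],
    [ 0.366,  0.183,  0.304, -0.087,  0.439, 0.322],
    [ 0.000,  0.000,  0.506,  0.000,  0.322, 0.493],
]

centroid_loadings = [
    [0.409,  0.647,  0.058],
    [0.379,  0.244, -0.315],
    [0.569, -0.444,  0.184],
    [0.160, -0.271, -0.522],
    [0.585,  0.174,  0.257],
    [0.506, -0.350,  0.337],
]

def reproduce(loadings):
    """Inner products of the rows of loadings (communalities on the diagonal)."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in loadings]
            for row_i in loadings]

reproduced = reproduce(centroid_loadings)
largest_gap = max(abs(reproduced[i][j] - correlations[i][j])
                  for i in range(6) for j in range(6))
print(round(largest_gap, 4))
```

The largest discrepancy is of the order of a unit in the third decimal place, which is what rounding of three-figure loadings would lead one to expect.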


When these are “extended” in the manner of our page 157 
and a diagram like Figure 26 made, we obtain Figure 32. 
It is still a triangle, and although its measurements are 
different, the same tests are found defining each side as 
before. The corners of the triangle may, with Professor 
Thurstone, reasonably be claimed to represent the same 
factors as before selection, although their correlations have 
changed. 

The plane of Figure 32 is not the same as the plane of 
Figure 26, being at right angles to a different first centroid. 
When adjustment is made for this, as Professor Thurstone 
has presumably done in his chapter (though 
without sufficient explanation), then the directly selected 
test point has not moved, while the other points have 
moved radially away from or towards it. 

Figure 32. 

If the above matrix of centroid loadings is postmultiplied 
by the rotating matrix obtained from the diagram, 
viz.: 


     .721     .448     .641
    −.499    −.201     .744
     .480    −.874    −.190


we obtain the new simple structure on the reference vectors: 


          A       B       C
 1        .       .     .732
 2        .     .394    .484
 3      .562    .180      .
 4        .     .472      .
 5      .459      .     .455
 6      .702      .       .




If this is compared with the table on page 154 it will be 
seen that the zeros are in the same places, although the 
non-zero entries have altered (except in Test 6, which was 
uncorrelated with the directly selected Test 2, and therefore 
is unaffected in composition). 

If the correlations between the factors are calculated by 
the method of pages 181-2, factor A is found to be still 
uncorrelated with B and C, but these last two have a 
correlation coefficient of −.3: that is, they are no longer 
orthogonal but at an obtuse angle of about 107½°. 
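The step from a correlation between two factors to the angle between them is the inverse cosine. A small sketch (the function name is illustrative) shows that a correlation of −.3 corresponds to an angle of about 107½ degrees, while zero correlation gives a right angle:

```python
from math import acos, degrees

def angle_between_factors(correlation):
    """Angle in degrees between two factor axes whose cosine is the correlation."""
    return degrees(acos(correlation))

angle_bc = angle_between_factors(-0.3)
print(round(angle_bc, 1))
```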

6. Multivariate selection and simple structure.—But 
though Thurstone must, I think, be granted his claim that 
univariate selection will not destroy the identity of his 
oblique simple structure factors, but only change their 
intercorrelations, the situation would seem to be very 
different with multivariate selection. 

Multivariate selection is not the same thing as repeated 
univariate selection. The latter will not change the rank 
of the correlation matrix with suitable communalities, nor 
will it change the position of zero loadings in simple struc- 
ture. Repeated univariate selection will, it is true, cause 
all the correlations to alter, but only indirectly and in such 
a way as to preserve rank, simple structure, and factor 
identity. 

But in multivariate selection it is envisaged that the 
correlation between two variables may itself be directly 
selected, and caused to have a value other than that which 
would naturally follow from the reduction of standard 
deviation in two selected variables. Selection for correla- 
tion is just as easily imagined as is selection for scatter. 
Indeed, in natural selection it is possibly even commoner. 

Once we select for the correlations, however, as well as 
for scatter, new “factors” emerge, old ones change. In 
this chapter we have supposed a small part $R_{pp}$ of the whole 
correlation matrix to be changed to $V_{pp}$, and found that 
one new factor is created (page 300) or, indeed, two new 
oblique factors (page 298). We might have supposed $R_{pp}$ to 
be a larger portion of R : and there is nothing to prevent 
us supposing selection to go on for the whole of R, and 
writing down a brand-new table of coefficients whose 
“factors” would be quite different from those of the 
original table. In our example of page 152, for instance, 
where the three oblique “factors” coincided in direction 
with the communal parts of Tests 1, 4, and 6, there is 
nothing to prevent us from writing down, as having 
been produced by selection, a new set of correlation 
coefficients whose analysis would identify the “factors” with the 
communal parts of Tests 2, 3, and 5. In fact, all we would 
have to do would be to renumber the rows and columns on 
page 152. Such fundamental changes could be produced 
by selection : and perhaps they have been, for natural 
selection has had plenty of time at its disposal. 

Professor Thurstone (his page 458, footnote, in Multiple 
Factor Analysis) classes the new factors produced by 
selection as “ incidental factors (which) can be classed 
with the residual factors, which reflect the conditions of 
particular experiments.” But we can hardly dismiss 
them thus easily if, as is conceivable, they have become 
the main or perhaps the only factors remaining, the others 
having disappeared ! 

It may be admitted at once, however, that the actual 
amount of selection from psychological experiment to 
psychological experiment is not likely to make such 
alarming changes in factors. For the use to which factors 
are likely to be put in our age, in our century or more, they 
are likely to be independent enough of such selection as can 
go on in that time, and in that sense Professor Thurstone 
is justified in his thesis. Nor am I one to deny “reality” 
to any quality merely because it has been produced by 
selection, and may not abide for all time. 


PART VII 
THE NATURE OF FACTORS 


CHAPTER XX 
THE SAMPLING THEORY 


1. Two views. A hierarchical example as explained by one 
general factor.—The advance of the science of factorial 
analysis of the mind to its present position has not taken 
place without controversy, and it is the purpose of the pre- 
sent chapter to give a preliminary description of some 
objections which have been frequently raised by the 
present writer (Thomson, 1916, 1919a, 1935b, etc.) and which 
he still holds. 

The contrast between the factorial point of view and 
Thomson's sampling theory can be best seen by considering 
the explanation of the same set of correlation coefficients 
by both views. To simplify the argument we shall take 
in the first place a set of correlation coefficients whose 
tetrads are exactly zero, which can therefore be completely 
“explained” by a general factor g and specifics, as in this 
table : 


    |     1       2       3       4
 1  |     .     .746    .646    .527
 2  |   .746      .     .577    .471
 3  |   .646    .577      .     .408
 4  |   .527    .471    .408      .


We can more exactly follow the argument if we employ 
the vulgar fractions of which these are the decimal 
equivalents, namely the following, each divided by 6: 


    |     1       2       3       4
 1  |     .      √20     √15     √10
 2  |    √20      .      √12     √8
 3  |    √15     √12      .      √6
 4  |    √10     √8      √6       .


In this form the tetrad-differences are all obviously zero 
by inspection. These correlations can therefore be 
explained by one general factor, as in Figure 33, which gives 
them exactly. 


Figure 34.     Figure 35.     Figure 36. 


We have here a general factor of variance 30 which is 
the sole cause of the correlations, and specific factors of 
variances 6, 15, 30, and 60. The variances of the four 
“tests” are 36, 45, 60, and 90. The “communalities” 
and “specificities” are: 

Test               1        2        3        4      Totals

Communality      30/36    30/45    30/60    30/90    420/180 = 2.333

Specificity       6/36    15/45    30/60    60/90    300/180 = 1.667

Totals             1        1        1        1        4
These communalities can be calculated from the correlation 
coefficients, for it will be remembered (Chapter I, 
Section 4) that when tetrad-differences are exactly zero, 
each correlation coefficient can be expressed as the 
product of two correlation coefficients with g (two 
“saturations”). Thus: 

$$ r_{12} = r_{1g} r_{2g}, \qquad r_{13} = r_{1g} r_{3g}, \qquad r_{23} = r_{2g} r_{3g} $$

Therefore: 

$$ \frac{r_{12} r_{13}}{r_{23}} = \frac{(r_{1g} r_{2g})(r_{1g} r_{3g})}{r_{2g} r_{3g}} = r_{1g}^{2} $$

the square of the saturation of Test 1 with g. And when 
there is only one common factor, the square of its 
saturation is the communality. 

The quantity $r_{12} r_{13} / r_{23}$, therefore, means, on this theory 
of one common factor, the communality, or square of the 
saturation with g, of the first test. Its value in our 
example is 30/36, or five-sixths. 
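The value of five-sixths can be confirmed directly from the three correlations concerned. A brief sketch (the function name is an assumption of this sketch, and the correlations are entered as the square roots divided by 6 given in the table of vulgar fractions):

```python
from math import sqrt, isclose

def communality(r_ij, r_ik, r_jk):
    """Communality of test i from its correlations with j and k and theirs with each other."""
    return r_ij * r_ik / r_jk

r12, r13, r23 = sqrt(20) / 6, sqrt(15) / 6, sqrt(12) / 6

communality_1 = communality(r12, r13, r23)   # Test 1
communality_2 = communality(r12, r23, r13)   # Test 2
print(round(communality_1, 4), round(communality_2, 4))
```

The same function applied to Test 2 gives four-sixths, in agreement with the communality 30/45 in the table.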

2. The alternative explanation. The sampling theory. 
—The alternative theory to explain the zero tetrad- 
differences is that each test calls upon a sample of the bonds 
which the mind can form, and that some of these bonds are 
common to two tests and cause their correlation. In the 
present instance we have arranged this artificial example 
so that the tests can be looked upon as samples of a very 
simple mind, which can form in all 108 bonds (or some 
multiple of 108).* The first test uses five-sixths of these 
(or 90), the second test four-sixths (or 72), the third three- 
sixths (54), and the fourth two-sixths (or 36). These 
fractions are the same in value as the communalities of 
the former theory. Each of them may be called the 
“richness” of the test. Thus Test 1 is most rich, and 
draws upon five-sixths of the whole mind. The fractions 
such as $r_{12} r_{13} / r_{23}$, which in the former theory were 
“communalities,” are in the sampling theory “coefficients 
of richness.” They formerly indicated the fraction of the 
variance supplied by g; they indicate here the fraction 
which each test forms of the whole “mind” (but see later, 
concerning “sub-pools”). 


* There is nothing mysterious about the number 108. It is 
chosen merely because it leads to no fractions in the diagram. 
Any large number would do. 

Now, if our four tests use respectively 90, 72, 54, and 36 
of the available bonds of the mind, as indicated in Figure 
34, then there may be almost any kind of overlap between any 
two of the tests. Any of the cells of the diagram may have 
contents, instead of all being empty except for g and the 
specifics. If we know nothing more about the tests except 
the fractions we have called their “richnesses,” we cannot 
tell with certainty what the contents of each cell will be, 
but we can calculate what the most probable contents will 
be. If the first test uses five-sixths and the second test 
four-sixths of the mind's bonds, it is most probable that 
there will be a number of bonds common to both tests 
equal to 5/6 x 4/6, or 20/36ths of the total number. That is, 
the four cells marked a, b, c, d in the diagram, the cells 
common to Tests 1 and 2, will most likely contain: 

$$ \frac{20}{36} \times 108 = 60 \text{ bonds} $$

between them. By an extension of the same principle we 
can find the most probable number in each cell. Thus 
the number of bonds used in all four of the tests is most 
probably: 

$$ \frac{5}{6} \times \frac{4}{6} \times \frac{3}{6} \times \frac{2}{6} \times 108 = 10 \text{ bonds.} $$

In this way we reach the most probable pattern of 
overlap of the four tests shown in Figure 35. And this 
diagram gives exactly the same correlations as did Figure 33. 
Let us try, for example, the value of $r_{23}$ in each diagram. 
In Figure 33 we had: 

$$ r_{23} = \frac{30}{\sqrt{45 \times 60}} = \frac{\sqrt{12}}{6} = .577 $$

In Figure 35 the same correlation is: 

$$ r_{23} = \frac{20 + 10 + 4 + 2}{\sqrt{72 \times 54}} = \frac{\sqrt{12}}{6} = .577 $$
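The most probable contents of every cell can be tabulated mechanically. The sketch below assumes, as the argument does, that the tests sample the 108 bonds independently and at random, so that the expected content of a cell is 108 multiplied by the richness of each test that uses it and by one minus the richness of each test that does not. The function names are illustrative and the cells are indexed by which tests use them, not by the letters of the diagram.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

def most_probable_cells(richness, n_bonds):
    """Expected bonds in each cell; a cell is a tuple saying which tests use it."""
    cells = {}
    for pattern in product((True, False), repeat=len(richness)):
        share = Fraction(1)
        for used, p in zip(pattern, richness):
            share *= p if used else 1 - p
        cells[pattern] = share * n_bonds
    return cells

def bonds_shared(cells, i, j):
    """Bonds lying in cells used by both test i and test j (tests numbered from 0)."""
    return sum(count for pattern, count in cells.items() if pattern[i] and pattern[j])

def correlation(cells, i, j):
    common = bonds_shared(cells, i, j)
    size_i, size_j = bonds_shared(cells, i, i), bonds_shared(cells, j, j)
    return float(common) / sqrt(float(size_i * size_j))

richness = [Fraction(5, 6), Fraction(4, 6), Fraction(3, 6), Fraction(2, 6)]
cells = most_probable_cells(richness, 108)

print(cells[(True, True, True, True)])      # bonds used by all four tests
print(bonds_shared(cells, 0, 1))            # bonds common to Tests 1 and 2
print(round(correlation(cells, 1, 2), 3))   # correlation of Tests 2 and 3
```

Exact fractions are used for the cell contents so that the whole-number counts of the diagram appear without rounding.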
This form of overlap, therefore, will give zero tetrad- 
differences, just as the theory of one general factor did. 
More exactly, this sampling theory gives zero tetrad- 
differences as the most probable (though not the certain) 
connexion to be found between correlation coefficients 
(Thomson, 1919a) if the sampling of causes is random. 

If we let $p_1$, $p_2$, $p_3$, and $p_4$ represent fractions which the 
four tests form of the whole pool of N bonds of the mind, 
then the number common to the first two tests will most 
probably be $p_1 p_2 N$, and the correlation between the tests: 

$$ r_{12} = \frac{p_1 p_2 N}{\sqrt{p_1 N \cdot p_2 N}} = \sqrt{p_1 p_2} $$

We therefore have, in any tetrad, quantities like the 
following : 


    |      3            4
 1  |  √(p1 p3)     √(p1 p4)
 2  |  √(p2 p3)     √(p2 p4)


and the tetrad-difference is, most probably (Thomson, 
1927a, 253): 

$$ \sqrt{p_1 p_3 \, p_2 p_4} - \sqrt{p_1 p_4 \, p_2 p_3} = 0 $$
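The vanishing of the tetrad-difference holds for any set of richnesses, not only those of the example. A short sketch, in which the function names and the second set of fractions are purely illustrative:

```python
from math import sqrt

def sampling_correlation(p_i, p_j):
    """Most probable correlation of two tests of richness p_i and p_j."""
    return sqrt(p_i * p_j)

def tetrad_difference(p, a, b, c, d):
    """r_ac r_bd - r_ad r_bc for tests a, b (rows) and c, d (columns)."""
    return (sampling_correlation(p[a], p[c]) * sampling_correlation(p[b], p[d])
            - sampling_correlation(p[a], p[d]) * sampling_correlation(p[b], p[c]))

example = [5 / 6, 4 / 6, 3 / 6, 2 / 6]   # richnesses of the four tests in the text
other = [0.9, 0.35, 0.6, 0.15]           # an arbitrary set, for comparison

print(tetrad_difference(example, 0, 1, 2, 3))
print(tetrad_difference(other, 0, 1, 2, 3))
```

Both differences are zero apart from floating-point rounding.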

This may be expressed by saying that the laws of proba- 
bility alone will cause a tendency to zero tetrad-differences 
among correlation coefficients. In another form this 
statement can be worded thus: The laws of probability or 
chance cause any matrix of correlation coefficients to tend 
to have rank 1, or at least to tend to have a low rank (where 
by rank we mean the maximum order among those non- 
vanishing minors which avoid the principal diagonal 
elements). 

It is, in the opinion of the present writer, this fact, a 
result of the laws of chance and not of any psychological 
laws, which has made conceivable the analysis of mental 
abilities into a few common factors (if not into one only, 
as Spearman hoped) and specifics. Because of the laws 
of chance the mind works as if it were composed of these 
hypothetical factors g, v, n, etc., and a number of specific 
factors. The causes may be “anarchic,” meaning that 
they are numerous and unconnected, yet the result is 
“monarchic,” or at least “oligarchic,” in the sense that 
it may be so described, provided always that large specific 
factors are allowed. 
3. Specific factors maximized.—The specific factors play, 
in the usual methods of factorization, an important rôle, 
and our present example can be used to illustrate the fact, 
which is not usually realized, that all these methods 
maximize the specifics (Thomson, 1938c) by their insistence 
on minimizing the number of common factors. In Figure 
33, of the whole variance of 4, the specific factors contribute 
1.667, or 41.7 per cent. In Figure 35, they contribute: 

$$ \frac{10}{90} + \frac{4}{72} + \frac{2}{54} + \frac{1}{36} = \frac{250}{1{,}080} = .2315, \text{ or 5.8 per cent.} $$
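The two totals can be compared with exact fractions. In this sketch the specific variances 6, 15, 30, and 60 are those of the one-factor analysis, and the counts 10, 4, 2, and 1 are the most probable numbers of bonds used by one test only under random sampling of the 108 bonds; the variable and function names are illustrative.

```python
from fractions import Fraction

# Specificity of each test: specific variance (or bonds) over total variance (or bonds).
one_factor_specifics = [Fraction(6, 36), Fraction(15, 45), Fraction(30, 60), Fraction(60, 90)]
sampling_specifics = [Fraction(10, 90), Fraction(4, 72), Fraction(2, 54), Fraction(1, 36)]

def total_and_share(specificities):
    """Total specific variance and its share of the whole variance (one unit per test)."""
    total = sum(specificities)
    return total, total / len(specificities)

one_factor_total, one_factor_share = total_and_share(one_factor_specifics)
sampling_total, sampling_share = total_and_share(sampling_specifics)

print(float(one_factor_total), round(float(one_factor_share) * 100, 1))
print(float(sampling_total), round(float(sampling_share) * 100, 1))
```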


Apart from [...] in practice, it is generally true that [...] 

Innumerable other equivalent analyses of the correlations 
can be made, [...] the specifics which is less [...] 
in Figure 36 (page 308), is an analysis which has no general 
factor but six other common factors, and which gives a 
total specific variance of [...] 

Now, specific factors are undoubtedly a difficulty in any 
analysis, and to have the specific [...] factors are a 
difficulty seems to be recognized by Thurstone. “The 
specific variance of a test,” he [...] “should be regarded 
as a challenge,” [...] to splitting a specific factor up into 
group factors by [...] 
to happen if each analysis is conducted on the principle of 
making the specific variances as large as possible. We 
must, however, leave this point here, to return to it later. 

4. Sub-pools of the mind.—A difficulty which will occur 
to the reader in connexion with the sampling theory is that, 
when the correlation between two tests is large, it seems to 
imply that each needs nearly the whole mind to perform 
it (Spearman, 1928, 257). In our example the correlation 
between Tests 1 and 2 was .746, a correlation not infrequently 
reached between actual tests. It is, for instance, 
almost exactly the correlation reported by Alexander 
between the Stanford-Binet test and the Otis Self- 
administering test (Alexander, 1935, Table XVI). Does 
this, then, mean that each of these tests requires the 
activity of about four-sixths or five-sixths of all the 
“bonds” of the brain? Not necessarily, even on the 
sampling theory. These two tests are not so very unlike 
one another, and may fairly be described as sampling the 
same region of the mind rather than the whole mind, so 
that they may well include a rather large proportion of the 
bonds found in that region. They may be drawn, that is, 
from a sub-pool of the mind’s bonds rather than from the 
whole pool (Thomson, 1935b, 91; Bartlett, 1937a, 102). 
Nor need the phrase “region of the mind” necessarily 
mean a topographical region, a part of the mind in the 
same sense as Yorkshire is part of England. It may mean 
something, by analogy, more like the lowlands of England, 
all the land easily accessible to everybody, lying below, 
say, the 300-foot contour line. What the “bonds” of the 
mind are, we do not know. But they are fairly certainly 
associated with the neurones or nerve cells of our brains, 
of which there are probably round about ten thousand 
million in each normal brain. Thinking is accompanied 
by the excitation of these neurones in patterns. The 
simplest patterns are instinctive, more complex ones 
acquired. Intelligence is possibly associated with the 
number and complexity of the patterns which the brain 
can (or could) make. A “region of the mind” in the 
above paragraph may be the domain of patterns below a 
certain complexity, as the lowlands of England are below 
a certain contour line. Intelligence tests do not call upon 
[...] for these 
are always associated with acquired material [...] 
educational environment, and intelligence tests [...] 
avoid testing acquirement. It is not difficult to imagine 
that the items of the Stanford-Binet test call into some 
sort of activity nearly all the neurones of the brain, though 
they need not thereby be calling upon all the patterns 
which those neurones can form. When [...] is 
demonstrating to an advanced class that “a quadratic 
form of rank 2 is identically equal to the product of 
two linear forms,” he is using patterns of a complexity 
greater than any used in answering the Binet-Simon [...]. 
But the neurones which form these patterns may not be 
more numerous. Those complicated patterns, however, 
are forbidden to the intelligence tester, for a very intelligent 
man may not have the ghost of an idea what a “quadratic 
form” is. Within the limits of the comparatively simple 
patterns of the brain which they evoke, it seems [...] 
possible that the two tests in question call upon a large 
proportion of these, and have a large number in common. 

As has been indicated, the author is of opinion that 
the way in which they magnify specific factors is the 
weak side of the theories of a few common factors. This 
does not mean, however, that a description of a matrix of 
correlations in terms of these theories is inexact. Men 
undoubtedly do perform mental tasks as if they were doing 
so by means of a comparatively small number of group 
factors of wide extent, and an enormous number of specific 
factors of very narrow range but of great importance each 
within its range. Whether a description of their powers in 
terms of the few common factors only is a good description 
depends in large measure on what purpose we want [...] 
The practical purpose is usually [...] 
vocational advice to the man or to [...], and factors, 
though they [...] improve and indeed may blur the accuracy 
of vocational [...], [...] however, facilitate them where 
otherwise they would have been impossible, as money 
facilitates [...] 

[...] the theories which use the smallest number of common 
factors seem to have drawbacks. They can give an exact 
reproduction of the correlation coefficients. But, because 
of their large specific factors, they do not enable us to give 
an exact reproduction of each man's scores in the original 
tests, so that much information is being lost by their 
use. 

It will be seen from considerations such as these that 
alternative analyses of a matrix of correlations, even 
although they may each reproduce the correlation coeffi- 
cients exactly, may not be equally acceptable on other 
grounds. The sampling theory, and the single general 
factor theory, can both describe exactly a hierarchical set 
of correlation coefficients, and they both give an explana- 
tion of why approximately hierarchical sets are found in 
practice. In a mathematical sense, they are alternatives. 
But we cannot keep both as realities, though we may 
employ either mathematically. 

5. The inequality of men.—Professor Spearman opposed 
the sampling theory chiefly on the ground that it would 
make all correlations equal (and zero), and involve the 
further consequence that all men are equal in their average 
attainments (Abilities, 96), if the number of elementary 
bonds is large, as the sampling theory requires. Both 
these objections, however, arise from a misunderstanding 
of the sampling theory, in which a sample means “some 
but not all” of the elementary bonds (Thomson, 1935b, 
72, 76). As has been explained, tests can differ, on this 
theory, in their richness or complexity, and less rich tests 
will tend to have low, more complex tests will tend to have 
high correlations, at any rate if the “bonds” tend to be 
all-or-none in their nature, as the action of neurones is 
known to be. And as for the assertion that the theory 
makes all men equal, there is no basis whatever for the 
suggestion that it assumes every man to have an equal 
chance of possessing every element or bond. On the con- 
trary, the sampling theory would consider men also to be 
samples, each man possessing some, but not all, both of the 
inherited and the acquired neural bonds which are the 
physical side of thought. Like the tests, some men are 
rich, others poor, in these bonds. Some are richly endowed 
by heredity, some by opportunity and education, some 
by both, some by neither. The idea that men are samples 
of all that might be, and that any task samples the powers 
which an individual man possesses, does not for a moment 
carry with it the consequences asserted of equal correlations 
and a humdrum mediocrity among human kind. 

6. Negative and positive correlations.*—The great majority
of correlation coefficients reported in both biometric
and psychological work are positive. This almost certainly
represents an actual fact, namely that desirable qualities
in mankind tend to be positively correlated; for though
reported correlations may be selected by the unconscious
prejudices of experimenters, who are usually on the look-out
for things which correlate positively, yet as those who
have tried know, it is really very difficult to discover
negative correlations between mental tests. Moreover, even
in imagination we cannot make a race of beings with
predominantly negative correlations. A number of lists
of the same persons in order of merit can be all very like
one another, can indeed all be identical, but they cannot
all be the opposite of one another. If Lists a and b are
the inverse of one another, List c, if it is negatively
correlated with a, will be positively correlated with b.
Among a number n of variates, it is logically possible to
have a square table of correlation coefficients each equal
to unity, i.e. an average correlation of unity. But the
farthest the average correlation can be pushed in the
negative direction is −1/(n − 1). That is, if n is large,
the average correlation can range from +1 to only a
little below zero. Even Mother Nature, then, by natural
selection or by any other means, could not endow man
with abilities which showed both many and large negative
correlations. If they were many, they would have to be
very small; if they were large, they would have to be
very few.
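
The limit just stated can be checked with a small sketch. It assumes standardized variates, so that the variance of their sum, n + n(n − 1) times the average correlation, cannot be negative. The three orders of merit and the function names below are hypothetical illustrations, not data from the text.

```python
from fractions import Fraction

def lowest_average_correlation(n):
    """Smallest possible mean correlation among n variates.

    For standardized variates, Var(z1 + ... + zn) = n + n(n - 1) * mean_r,
    which cannot be negative, so mean_r >= -1 / (n - 1).
    """
    return Fraction(-1, n - 1)

def rank_correlation(x, y):
    """Spearman rank correlation of two orders of merit without ties."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - Fraction(6 * d_squared, n * (n * n - 1))

list_a = [1, 2, 3, 4, 5]
list_b = list_a[::-1]       # the exact inverse of list a
list_c = [4, 5, 3, 1, 2]    # hypothetical list, negatively correlated with a

r_ab = rank_correlation(list_a, list_b)
r_ac = rank_correlation(list_a, list_c)
r_bc = rank_correlation(list_b, list_c)
mean_r = (r_ab + r_ac + r_bc) / 3

print("r_ab, r_ac, r_bc:", r_ab, r_ac, r_bc)
print("mean correlation:", mean_r, "lower limit:", lowest_average_correlation(3))
for n in (3, 10, 100):
    print(n, "variates: average cannot fall below", lowest_average_correlation(n))
```

Because list b is the inverse of list a, any list c that correlates negatively with a must correlate positively, and equally strongly, with b, which is why the average cannot be driven far below zero.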


* This refers to correlations between tests. The greater
frequency of negative correlations between persons has already been
discussed in Chapter XVI.

favour positive correlations within the species.* In the case
of some physical organs it is obvious that a high positive 
correlation is essential to survival value—for example, 
between right and left leg, or between legs and arms. In 
these cases of actual paired organs, however, it is doubtless 
more than a mere figure of speech to speak of a common 
factor as the cause. Between organs not simply related 
to one another, as say eyes and nose, natural selection, 
if it tended towards negative correlation, would probably 
split the genus or species into two, one relying mainly on 
eyesight, the other mainly on smell. Within the one 
species, since it is mathematically easier to make positive 
than negative correlations, it seems likely that the former 
would largely predominate. To say that this was due to 
a general factor would be to hypostatize a very complex 
and abstract cause. To use a general factor in giving a 
description of these variates is legitimate enough, but is, 
of course, nothing more than another way of saying that 
the correlations are mainly positive—if, as is the case, most 


* An important kind of natural selection is the selection of one sex
by the other in mating. Dr. Bronson Price (1936) has pointed out
that positive cross-correlation in parents will produce positive
correlation in the offspring. Price further shows that this positive
cross-correlation in the parents will result if the mating is highly
homogamous for total or average goodness in the traits, a conclusion which,
it may be remarked here, can be easily seen by using the pooling
square described in our Chapter XIV. Price concludes: “The
intercorrelations which g has been presumed to illumine are seen
primarily as consequences of the social and therefore marital
importance which has attached to the abilities concerned.” Price
in his argument makes use of formulae from Sewall Wright (1921).
M. S. Bartlett, in a note on Price’s paper (Bartlett, 1937b), develops
his argument more generally, also using Wright's formulae, and says:
“Price contrasts the idea of elementary genetic components with
factor theories. . . . It should, however, be pointed out that a
statistical interpretation of such current theories can be and has been
advocated. Thomson has, for example, shown”, and here
follows a brief outline of the sampling theory. “On the basis of
Thomson’s theory,” Bartlett adds, “I have pointed out (Bartlett,
1937a) that general and specific abilities may naturally be defined
in terms of these components, and that while some statistical
interpretation of these major factors seems almost inevitable, this
may not in itself render their conception invalid or useless.”


people mean by a general factor one which helps in every
case, not an interference factor which sometimes helps and
sometimes hinders.

7. Low reduced rank.—It is, however, on the tendency
to a low reduced rank in matrices of mental correlations
that the theory of factors is mainly built. It has very
much impressed people to find that mental correlations
can be so closely imitated by a fairly small number of
common factors. Ignoring the host of large specific factors
to which this view commits them, they have concluded
that the agreement was so remarkable that there must be
something in it. There is; but it is almost the opposite of
what they think. Instead of showing that the mind has
a definite structure, being composed of a few factors which
work through innumerable specific machines, the low rank
shows that the mind has hardly any structure. If the
early belief that the reduced rank was in all cases one had
been confirmed, that would indeed have shown that the
mind had no structure at all but was completely undifferentiated.
It is the departures from rank 1 which indicate
structure, and it is a significant fact that a general tendency
is noticeable in experimental reports to the effect that
batteries do not permit of being explained by as small a
number of factors in adults as in children, probably because
in adults education and vocation have imposed a structure
on the mind which is absent in the young.

By saying that the mind has little structure, nothing
derogatory is meant. The mind of man, and his brain, too,
are marvellous and wonderful. All that is meant by the
absence of structure is the absence of any fixed or strong
linkages among the elements (if the word may for a moment
be used without implications) of the mind, so that any
sample whatever of those elements or components can be
assembled in the activity called for by a “test.”

Not that there is any necessity to suppose that the mind
is composed of separate and atomic elements. It is possibly
a continuum, its elements if any being more like
the molecules of a dissolved crystalline substance than
like grains of sand. The only reason for using the word
“elements” is that it is difficult, if not impossible, to speak
of the different parts of the mind without assuming some
“items” in terms of which to think. For concreteness it
is convenient to identify the elements, on the mental side,
with something of the nature of Thorndike's “bonds,”
and on the bodily side with neurone arcs; in the remainder
of this chapter the word “bonds” will be used. But
there is no necessity beyond that of convenience and
vividness in this. The “bonds” spoken of may be
identified by different readers with different entities. All
a “bond” means is some very simple aspect of the causal
background. Some of them may be inherited, some may
be due to education. There is no implication that the
combined action of a number of them is the mere sum of
their separate actions. There is no commitment to
“mental atomism.”

If, now, we have a causal background comprising in- 
numerable bonds, and if any measurement we make can 
be influenced by any sample of that background, one 
measurement by this sample and another by that, all
samples being possible; and if we choose a number of 
different measurements and find their intercorrelations, 
the matrix of these intercorrelations will tend to be 
hierarchical, or at least tend to have a low reduced rank. 
This has nothing to do with the mind: it is simply a 
mathematical necessity, whatever the material used to 
illustrate it. 

8. A mind with only six bonds.—We shall illustrate this
fact first by imagining a “mind” which can form only
six “bonds,” which mind we submit to four “tests”
which are of different degrees of richness, the one requiring
the joint action of five bonds, the others of four, three, and
two respectively (Thomson, 1927b). These four tests will
(when we give them to a number of such minds) yield
correlations with one another. For we shall suppose the
different minds not all to be able to form all six of the
possible bonds, some individuals possessing all six, others
possessing smaller numbers.

We have only specified the richness of each test, but
have not said which bonds form each ability. There may,
therefore, be different degrees of overlap between them,
though some will be more frequent than others if we form
all the possible sets of four tests which are of richness five,
four, three, and two. If we call the bonds a, b, c, d, e,
and f, then one possible pattern of overlap would be the
following:


Test    Bonds
  1     a  b  c  d  e  .
  2     .  b  c  d  e  .
  3     .  .  c  .  e  f
  4     .  .  c  d  .  .


If we for further simplicity suppose these bonds to be
equally important, and use the formula

$$\text{Correlation} = \frac{\text{overlap}}{\text{geometrical mean of the two totals}}$$

we can calculate the correlations which these four tests
would give, namely:

          1         2         3         4
  1       .       4/√20     2/√15     2/√10
  2     4/√20       .       2/√12     2/√8
  3     2/√15     2/√12       .       1/√6
  4     2/√10     2/√8      1/√6        .


and we notice that in this particular pattern all three
tetrad-differences are zero. However, if we picked our
four tests at random (taking care only that they were of
these degrees of richness) we would not always or often get
the above pattern: in point of fact, we would get it only
12 times in 450. Nevertheless, it is one of the most probable
patterns. In all, 78 different patterns of the bonds
are possible, always adhering to our five, four, three, and
two, the probability of each pattern ranging from 12 in
450 down to 1 in 450.

It is possible to calculate the tetrad-differences for each
one of the 78 possible patterns of overlap which can occur.
When we then multiply each pattern by the expected frequency
of its occurrence in 450 random choices of the four
tests, we get 450 values for each tetrad-difference, distributed
as follows:

Values of          Frequency of
F × √120        F₁      F₂      F₃

     8                           2
     7                   4
     6                   8      14
     5           9       2       6
     4          27      34      23
     3           6      12      30
     2          75      72      48
     1          61      66      72
     0          99      54      81
    −1          56      78      36
    −2          67      42      42
    −3          16      30      60
    −4          30      36      18
    −5           0       0       0
    −6           4      12      18

  Total        450     450     450
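
The six-bond example is small enough to enumerate directly. The sketch below is a minimal illustration, assuming that every choice of the four tests is equally likely and that F₁ = r₁₃r₂₄ − r₁₄r₂₃, F₂ = r₁₂r₃₄ − r₁₄r₂₃, F₃ = r₁₂r₃₄ − r₁₃r₂₄ (a labelling inferred from the tabulated frequencies, not stated in the text). The particular bonds given to the example pattern are likewise an assumption; only its overlaps matter.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product

N_BONDS = 6
RICHNESS = (5, 4, 3, 2)   # bonds demanded by tests 1 to 4

def overlaps(tests):
    """Number of shared bonds for each of the six pairs of tests."""
    return {(i, j): len(tests[i] & tests[j])
            for i in range(4) for j in range(i + 1, 4)}

def tetrad_numerators(tests):
    """The three tetrad-differences in units of 1/sqrt(5*4*3*2) = 1/sqrt(120).

    Each correlation is overlap / sqrt(product of the two totals), so every
    product of two correlations shares the denominator sqrt(120).
    """
    o = overlaps(tests)
    f1 = o[0, 2] * o[1, 3] - o[0, 3] * o[1, 2]
    f2 = o[0, 1] * o[2, 3] - o[0, 3] * o[1, 2]
    f3 = o[0, 1] * o[2, 3] - o[0, 2] * o[1, 3]
    return f1, f2, f3

# Every way of choosing the four tests: 6 * 15 * 20 * 15 = 27,000 cases.
choices = [[frozenset(c) for c in combinations(range(N_BONDS), k)]
           for k in RICHNESS]
all_cases = list(product(*choices))
per_450 = Fraction(450, len(all_cases))   # rescale counts to "times in 450"

# An example pattern (hypothetical choice of bonds) and how often its
# set of overlaps turns up among all the cases.
example = [frozenset("abcde"), frozenset("bcde"), frozenset("cef"), frozenset("cd")]
example_tetrads = tetrad_numerators(example)
example_overlaps = overlaps(example)
example_frequency = sum(overlaps(t) == example_overlaps for t in all_cases) * per_450

# Distribution of F1 x sqrt(120), with its mean and the variance of F1.
f1_counts = Counter(tetrad_numerators(t)[0] for t in all_cases)
f1_table = {v: c * per_450 for v, c in sorted(f1_counts.items(), reverse=True)}
mean_f1 = sum(v * f for v, f in f1_table.items()) / 450
variance_f1 = sum(v * v * f for v, f in f1_table.items()) / (120 * 450)

print("example tetrads:", example_tetrads, "frequency in 450:", example_frequency)
for value, freq in f1_table.items():
    print(f"{value:3d}  {freq}")
print("mean:", mean_f1, "variance:", float(variance_f1))
```

Working with integer overlaps and exact fractions avoids rounding, so the mean comes out as an exact zero rather than a small floating-point residue.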


Although the distribution of each F about zero is slightly
irregular, the average value of each F is exactly zero. For
F₁ the variance is

$$\sigma^2 = \frac{2164}{120 \times 450} = 0.040$$


We see, then, that in this universe of very primitive
men whose brains can form only six bonds, four
tests which demanded respectively five, four, three, and
two bonds would give tetrad-differences whose expected
value would be zero, the values actually found being
grouped around zero with a certain variance. There is no
particular mystery about the four “richnesses” five, four,
three, and two, by the way. We might have taken any

Let the non-zero roots form the diagonal matrix D. Then
the principal axes analyses are:

$$W = H_1 D^{1/2} F_1, \quad \text{dimensions } (t \cdot r)(r \cdot r)(r \cdot p)$$

and

$$W' = H_2 D^{1/2} F_2, \quad \text{dimensions } (p \cdot r)(r \cdot r)(r \cdot t)$$

where H₁ and H₂ are the latent vectors of WW′ and W′W,
while F₁ is the matrix of factors possessed by persons,
F₂ that of factors possessed by traits. From the analysis
of W we have, taking the transpose,

$$W' = F_1' D^{1/2} H_1', \quad \text{dimensions } (p \cdot r)(r \cdot r)(r \cdot t)$$

and comparison of this with the former expression for W′
makes the reciprocity of H₂ and F₁′, F₂ and H₁′, evident.

19. Oblique factors. Structure and pattern.—In Thurstone's
notation, which we shall follow in this paragraph,
the matrix M of our equation (3), when it refers to centroid
factors, is called F. Our equation (3) becomes in his
notation

$$s = Fp$$

Since centroid factors are orthogonal, F is both a pattern
and a structure. The structure is the matrix of correlations
between tests and factors, i.e.:

$$\text{Structure} = sp' = (Fp)p' = F(pp') = FI = F = \text{Pattern}.$$

When the factors are oblique, however, this is not the
case. In that case, Structure = Pattern × matrix of
correlations between the factors.

Thurstone turns the centroid factors to a new set of
positions (still within the common-factor space, and in
general oblique to one another) called reference vectors.
The rotating matrix is Λ, and

$$V = F\Lambda \tag{63}$$

is the structure on the reference vectors. The cosines of
the angles between the reference vectors are given by Λ′Λ.
V is not a pattern. Its rows cannot be used as coefficients
in equations specifying a man’s scores in the tests, given
his scores in the reference vectors. The pattern on the
reference vectors would differ from the values found in V.

The primary factors are the lines of intersection of the
hyperplanes which are at right angles to the reference
vectors, taken (r − 1) at a time where r is the number of
common factors, the number of dimensions in the common-factor
space. They are defined, therefore, by the equations
of the hyperplanes, taken (r − 1) at a time. These
equations are

$$\Lambda' x = 0 \tag{64}$$

where x is a column vector of co-ordinates along the
centroid axes. The direction cosines of the intersections
of these hyperplanes taken (r − 1) at a time are therefore
proportional to the elements in the columns of (Λ′)⁻¹,
and to make them into direction cosines this has to have
its columns normalized by post-multiplication by a diagonal
matrix D, giving for the structure on the primary factors

$$F(\Lambda')^{-1}D \tag{65}$$

D is also the matrix of correlations between the reference
vectors and the primary factors, for

$$\Lambda'(\Lambda')^{-1}D = D \tag{66}$$

Each primary factor is therefore correlated with its own
reference vector but orthogonal to all the others, as can
also be easily seen geometrically.

The matrix of intercorrelations of the primary factors
is DΛ⁻¹(Λ′)⁻¹D from equation (65).

If W is the pattern on the primary factors p, so that

$$\text{test scores } s = Wp$$

then the structure on the primary factors is also

$$sp' = Wpp'$$

where pp′ is the matrix of correlations between the primary
factors, and therefore

$$\text{Primary factor structure} = WD\Lambda^{-1}(\Lambda')^{-1}D \tag{67}$$

Also, this structure = F(Λ′)⁻¹D from (65).
Equating these we have:

$$WD\Lambda^{-1} = F$$

whence

$$W = F\Lambda D^{-1} \tag{68}$$

$$= VD^{-1} \tag{69}$$

We have, therefore,

                       Structure        Pattern
Reference vectors      FΛ               F(Λ′)⁻¹            (70)
Primary factors        F(Λ′)⁻¹D         FΛD⁻¹
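
The relations between structure and pattern for oblique factors can be checked numerically. The following is a minimal sketch for r = 2 common factors, assuming a hypothetical loading matrix F and two arbitrarily chosen reference vectors; the helper functions and all numbers are illustrative and are not taken from Thurstone or from the text.

```python
from math import cos, sin, sqrt, radians, isclose

def matmul(a, b):
    """Product of two matrices held as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def diagonal(values):
    return [[v if i == j else 0.0 for j in range(len(values))]
            for i, v in enumerate(values)]

def all_close(a, b):
    return all(isclose(x, y, abs_tol=1e-9)
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

# Hypothetical loadings of four tests on two orthogonal centroid factors.
F = [[0.8, 0.3], [0.7, 0.4], [0.6, -0.2], [0.5, -0.4]]

# Columns of LAMBDA: direction cosines of two oblique reference vectors
# (angles of 20 and 100 degrees chosen arbitrarily for illustration).
t1, t2 = radians(20.0), radians(100.0)
LAMBDA = [[cos(t1), cos(t2)], [sin(t1), sin(t2)]]

V = matmul(F, LAMBDA)                                  # structure on reference vectors
lt_inv = inverse_2x2(transpose(LAMBDA))                # inverse of LAMBDA transposed
norms = [sqrt(sum(row[j] ** 2 for row in lt_inv)) for j in range(2)]
D = diagonal([1.0 / n for n in norms])                 # normalizes the columns
primary_axes = matmul(lt_inv, D)                       # direction cosines of primaries
primary_structure = matmul(F, primary_axes)            # structure on primary factors
reference_primary = matmul(transpose(LAMBDA), primary_axes)    # should equal D
primary_corr = matmul(transpose(primary_axes), primary_axes)   # primary intercorrelations
W = matmul(V, diagonal(norms))                         # pattern on primaries, V times D inverse

print("reference/primary correlations equal D:", all_close(reference_primary, D))
print("structure = pattern x factor correlations:",
      all_close(matmul(W, primary_corr), primary_structure))
```

The second check is the oblique case of the rule that the structure equals the pattern multiplied by the matrix of correlations between the factors.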
“richnesses” and got a similar result. If there are no
linkages among the bonds, the most probable value of a
tetrad-difference will always be zero; and if all possible
combinations of the bonds are taken, the average of all the
tetrad-differences will be zero. With only six bonds in the
“mind,” however, the scatter on both sides of zero will be
considerable, as the above value of the standard deviation
of F₁ shows, viz.

$$\sigma = \sqrt{0.040} = 0.20$$


9. A mind with twelve bonds.—But as the number of 
bonds in the mind increases, the tetrad-differences crowd 
closer and closer to zero. Let us, for example, suppose 
exactly the same experiment as above conducted in a 
universe of men whose minds could form twelve bonds 
(instead of six), the four tests requiring ten, eight, six, and 
four of these (instead of five, four, three, and two) (Thomson,
1927b). This increase in complexity enormously
increases the work of calculating all the possible patterns
of overlap, and the frequency of each. There are now 
1,257 different square tables of correlation coefficients and 
still more patterns of overlap, some of which, however, 
give the same correlations. When each possibility is taken 
in its proper relative frequency (ranging from once to 
11,520 times) there are no fewer than 1,078,110 instances 
required to represent the distribution. They have,
nevertheless, all been calculated, and the distribution of
F₁ was as follows:

F₁×√1920    Freq.  |  F₁×√1920    Freq.  |  F₁×√1920    Freq.  |  F₁×√1920    Freq.
    20        225  |      7      17,760  |     −3      31,432  |    −13        24?
    18      1,800  |      6      74,3??  |     −4      72,676  |    −14      8,792
    16      1,755  |      5      15,744  |     −5      53,808  |    −15      4,144
    15      4,600  |      4      52,085  |     −6      49,328  |    −16      3,970
    14      3,840  |      3     1?1,008  |     −7      21,240  |    −18        112
    12     19,010  |      2      42,384  |     −8      41,951  |    −19        456
    11     10,632  |      1      28,096  |     −9       5,896  |    −20        584
    10      8,360  |      0     122,609  |    −10      29,184  |    −24         28
     9     26,0??  |     −1      63,024  |    −11       8,960  |
     8     37,785  |     −2      81,208  |    −12      15,672  |

Total 1,078,110

This table again gives an average value of F₁ exactly
equal to zero. But the separate values of the tetrad-difference
are grouped more closely round zero than
before, with a variance now given by

$$\sigma^2 = \frac{37{,}166{,}400}{1{,}920 \times 1{,}078{,}110} = 0.018$$

This is rather less than half the previous variance.
Doubling the number of bonds in the imagined mind has
halved the variance of the tetrad-differences. If we were
to increase the number of potential bonds supposed to
exist in the mind to anything like what must be its true
figure, we would clearly reach a point where the tetrad-differences
would be grouped round zero very closely
indeed.

The principle illustrated by the above concrete example
can be examined by general algebraic means, and the above
suggested conclusion fully confirmed (Mackie, 1928a,
1929). It is found that the variance of the tetrad-differences
sinks in proportion to 1/(N − 1), where N is the
number of bonds, when N becomes large, and the above
example agrees with this even for such small N's as 6 and
12; for

$$\frac{6 - 1}{12 - 1} \times 0.040 = 0.018 \text{ as found.}$$

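
The two variances quoted above can be reproduced without enumerating every pattern. The closed form in this sketch is a derivation made here under stated assumptions (each test is an independent, uniformly random sample of bonds of fixed richness, and F₁ = r₁₃r₂₄ − r₁₄r₂₃); it is not Mackie's algebra, and the function names are hypothetical.

```python
from fractions import Fraction

def mean_square_overlap(n_bonds, m, k):
    """E[o^2] for the overlap o of two independent random samples of m and k bonds."""
    p = Fraction(m * k, n_bonds * n_bonds)                      # a given bond is in both
    q = Fraction(m * (m - 1) * k * (k - 1),
                 (n_bonds * (n_bonds - 1)) ** 2)                # two given bonds are in both
    return n_bonds * p + n_bonds * (n_bonds - 1) * q

def tetrad_variance(n_bonds, n1, n2, n3, n4):
    """Variance of F1 = r13*r24 - r14*r23 when r_ij = overlap / sqrt(n_i * n_j).

    The mean of F1 is zero, so the variance is E[F1^2]. The cross term
    E[o13*o24*o14*o23] is the trace of a product of four commuting matrices,
    which reduces to the two products accumulated below.
    """
    lam = Fraction(1)
    mu = Fraction(1)
    for n in (n1, n2, n3, n4):
        lam *= Fraction(n * n, n_bonds)
        mu *= Fraction(n * (n_bonds - n), n_bonds * (n_bonds - 1))
    cross = lam + (n_bonds - 1) * mu
    numerator = (mean_square_overlap(n_bonds, n1, n3) * mean_square_overlap(n_bonds, n2, n4)
                 + mean_square_overlap(n_bonds, n1, n4) * mean_square_overlap(n_bonds, n2, n3)
                 - 2 * cross)
    return numerator / (n1 * n2 * n3 * n4)

var_6 = tetrad_variance(6, 5, 4, 3, 2)
var_12 = tetrad_variance(12, 10, 8, 6, 4)

print("six bonds:    variance =", float(var_6))
print("twelve bonds: variance =", float(var_12))
print("ratio of variances:", float(var_12 / var_6), " (6 - 1)/(12 - 1) =", 5 / 11)
```

The ratio of the two variances is close to, though not exactly, (6 − 1)/(12 − 1), in keeping with the statement that the 1/(N − 1) rule holds when N becomes large.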
In this mathematical treatment, bonds have been spoken 
of as though they were separate atoms of the mind, and, 
moreover, were all equally important. It is probably 
quite unnecessary to make the former assumption, which 
may or may not agree with the actual facts of the mind, 
or of the brain. Suitable mathematical treatment could
surely be devised to examine the case where the causal
background is, as it were, a continuum, different proportions
of it forming tests of different degrees of richness. And as
for the second assumption, it is in all likelihood merely
formal. Let the continuum be divided into parts of equal
importance, and then the number of these increased and
their extent reduced, keeping their importance equal.

What is necessary, to give the result that zero tetrads are
so highly probable, is that it be possible to take our tests
with equal ease from any part of the causal background; that
there be no linkages among the bonds which will disturb the 
random frequency of the various possible combinations ; 
in other words, that there be no “ faculties’? in the mind. 
And it is also necessary that all possible tests be taken in 
their probable frequency. 

In any actual experiment, of course, it is quite imprac- 
ticable to take all possible tests, which are indeed infinite 
in number. A sample of tests is taken. If this sample 
is large and random, then there should, in a mind without 
separate “ faculties,” without linkages between its bonds, 
be an approach to zero tetrads. The fact that this ten- 
dency attracted Professor Spearman’s attention, and was 
sufficiently strong to make him at first believe that all 
samples of tests showed it, provided care was taken to 
avoid tests so alike as to be almost duplicates (which 
would be “ statistical impossibilities ” in a random sample), 
indicates that the mind is indeed very free to use its bonds 
in any combination, that they are comparatively unlinked. 

The sampling theory assumes that each ability is composed
of some but not all of the bonds, and that abilities
can differ very markedly in their “richness,” some needing
very many “bonds,” some only few. It further requires
some approach to “all-or-none” reaction in the “bonds”;
that is, it supposes that a bond tends either not to come
into the pattern at all, or to do so with its full force. This
does not seem a very unnatural assumption to make. It
would be fulfilled if a “bond” had a threshold below which
it did not act, but above which it did act; and this property
is said to characterize neurone arcs and patterns. When
this form of sampling is assumed the rank of the correlation
matrix tends to be reducible to a small number, if all
possible correlations are taken, and finally to be one as the
bonds increase without limit.

It is important to realize what is meant by the rank
tending to rank 1 as more and more of the possible correlations
are taken. When the rank is 1 the tetrad-differences
are zero. But clearly, the reader may say,
taking more and more samples of the bonds to form more
and more tests will not change in any way the pre-existing
tetrad-differences, will not make them zero if they are not
zero already. That is perfectly true; but that is not
what is meant. As more and more tests are formed by
samples of the bonds, the number of zero and very small
tetrads will increase and swamp the large tetrads. The
sampling theory does not say that all tetrads will be
exactly zero, or the rank exactly 1. It says that the
tetrads will be distributed about zero (not because each
is taken both plus and minus, but when all are given their
sign by the same rule) with a scatter which can be reduced
without limit, in the sense that with more bonds the proportion
of large tetrads becomes smaller and smaller;
always provided all possible samples are taken, i.e. that
the family of correlation coefficients is complete.
With a finite number of tests this, of course, is not the
case: but if the tests are a random sample of all possible
tests, there will again be the approach to zero tetrads.
The same will be true if the tests are sampling not the whole
mind, but some portion of it, some sub-pool of our mind's
abilities. If we stray from this pool and fish in other
waters, we shall break the hierarchy; but if we sampled
the whole pool of a mind, we should again find the tendency
to hierarchical order. If the mind is organized into sub-pools
(such as the verbal sub-pool, say), then we shall be
liable to fish in two or three of them, and get a rank of
2 or 3 in our matrix, i.e. get two or three common factors,
in the language of the other theory.

10. Contrast with physical measurements.—The tendency
for tetrad-differences to be closely grouped around zero
appears to be stronger in mental measurements than elsewhere;
stronger, for example, than in physical measurements,
although it is found there too.

In physical measurements we do not measure a person's
body just from anywhere to anywhere. We observe organs
and measure them: leg, cranium, chest girth, etc. In other
words, the variates are not a random sample. The
physical body has an obvious structure, and the tendency to
hierarchical order in the correlation coefficients, although
present, is weaker than in mental measurements. The tendency
to zero tetrad-differences in the mind is due to the fact that the mind
has, comparatively speaking, no organs. We can, and do, 
measure it almost from anywhere to anywhere. No test 
measures a leg or an arm of the mind; every test calls 
upon a group of the mind’s bonds which intermingles in 
most complicated ways with the groups needed for other 
tests, without being a set pattern immutably linked into 
an organ. Of all the conceivable combinations of the 
bonds of the mind we can, without great difficulty, take a 
random sample, whereas in physical measurements we take 
only the sample forced on us by the organs of the body. 
Being free to measure the mind almost from anywhere to 
anywhere, we can get a set of measurements which show 
“hierarchical order ” without overgreat trouble. We can 
do so because the mind is so comparatively structureless. 
Mental measurements tend to show hierarchical order, and
to be susceptible of mathematical description in terms of
one general factor or few, and innumerable specifics, not
because there are specific neural machines through which
its energy must show itself, but just exactly because there
are no fixed neural machines. The mind is capable of
expressing itself in the most plastic and Protean way,
especially before education, language, the subjects of
the school curriculum, the occupation, and the political
beliefs of adult life have imposed a habitual structure on
it. It is not without significance that the “factor” most
widely recognized after Spearman's g is the verbal factor v,
the mother-tongue being, as it were, the physical body of
the mind, its acquired structure.

11. Absolute variance of different tests.—It will be noted
that on the sampling theory the different tests will naturally
have different variances, the “richer” tests having a
wider scatter. This seems only natural. It is customary,
at any rate in theoretical discussions, to reduce all scores
in different tests to standard measure, thereby equalizing
their variance. This seems inevitable, for there
is no means of comparing the scatter of marks in two
different tests. But it does not follow that the scatter
would be really the same if some means of comparison
were available. When the same test is given to two
different groups we have no hesitation in ascribing a wider
variance to the one or the other group, and it seems con- 
ceivable that a similar distinction might mentally be made 
between the scores made by one group in two different 
tests. The writer is completely in accord with M. S. Bart- 
lett when he says (Bartlett, 1935, 205): “I think many 
people would agree . . . that the variation in mathematical 
ability displayed even in a selected group such as Cam- 
bridge Tripos candidates cannot be altogether put down 
to the method of marking adopted by the examiners." 
We may put these mathematics marks into standard
measure, and we may put the marks scored by the same 
group in, say, a form-board test, also into standard measure. 
But that does not imply that at bottom the two variances 
are equal, if only we had some rigorous way of comparing 
them. Our common sense tells us plainly that they are 
not equal in the absolute sense, though for many purposes 
their difference is irrelevant. It seems to be no defect, 
then, but rather a good quality, of the sampling theory 
to involve different absolute variances. 

12. A distinction between g and other common factors.—
The writer is inclined to make a distinction in interpretation
between the Spearman general factor g and the various
other common factors, mostly if not all of less extent than
g, which have been suggested. When properly measured
by a wide and varied hierarchical battery, g appears to him
to be an index of the span of the whole mind, other common 
factors to measure only sub-pools, linkages among bonds. 
The former measures the whole number of bonds; the 
latter indicate the degree of structure among them. 

Some of this “structure” is no doubt innate; but more
of it is probably due to environment and education and
life. Its expression in terms of separate uncorrelated
factors suggests what is almost certainly not the case, that
the “sub-pools” are separate from one another. The
actual organization is likely to be much more complicated
than that, and its categories to be interlaced and interwoven,
like the relationships of men in a community,
plumbers and Methodists, blonds, bachelors, smokers,
Conservatives, illiterates, native-born, criminals, and
school-teachers, an organization into classes which cut
across one another right and left.

Further, it is improbable that the organization of each
mind is the same. The phrase “factors of the mind”
suggests too strongly that this is so, and that minds differ
only in the amount of each factor they possess. It is more
than likely that different minds perform any task or test by
different means, and indeed that the same mind does so at
different times.

Yet with all the dangers and imperfections which attend 
it, it is probable that the factor theory will go on, and will 
serve to advance the science of psychology. For one thing, 
it is far too interesting to cease to have students and 
adherents. There is a strong natural desire in mankind 
to imagine or create, and to name, forces and powers 
behind the facade of what is observed, nor can any excep- 
tion be taken to this if the hypotheses which emerge 
explain the phenomena as far as they go, and are a guide 
to further inquiry. That the factor theory has been a 
guide and a spur to many investigators cannot be denied, 
and it is probably here that it finds its chief justification.


CHAPTER XXI 
SOME FUNDAMENTAL QUESTIONS 


It seems advisable to conclude with a brief discussion of
some of the fundamental theoretical questions needing an
answer. Among these are the following, of which (1)
and (3) are rather liable to be forgotten by those actually
engaged in making factorial analyses:

(1) What metric or system of units is to be used in
factorial analysis?

(2) On what principle are we to decide where to stop the
rotation of our factor-axes or how to choose them so that
rotation is unnecessary?

(3) Is the principle of minimizing the number of
common factors, i.e. of analysing only the communal
variance, to be retained?

(4) Are oblique, i.e. correlated factors to be permitted ? 

1. Metric.—Most of the work done in factorial analysis 
has assumed the scores of the tests to be standardized ; 
that is to say, in each test the unit of measure has been 
the actual standard deviation found in the distribution. 

This is in a sense a confession of ignorance. The accidental
standard deviation which happens to result from the particular
form of scoring used in a test means, of course,
nothing more. Yet there is undoubtedly something to be
said for the probability of real differences of standard
deviation existing between tests (see Chapter XX,
Section 11). In that case, if we knew these real standard
deviations, we would use variances and covariances and
analyse them, not correlations (compare Hotelling, 1933,
421-2 and 509-10).

Burt has urged the use of variances and covariances,
which are indeed necessary to him to enable his relation
between trait factors and person factors to hold (see Chapter
XVII, page 264). But the variances and covariances
he actually uses are simply the arbitrary ones which arise
from the raw scores, and depend entirely upon the scoring
system used in each test. It would seem necessary to
have some system of rational, not arbitrary, units.

Hotelling has already suggested one such, based upon 
the idea of the principal components of all possible tests, 
but it would seem to be unattainable in practice (Hotel- 
ling, 1933, 510). Another can be based on the ideas of the 
sampling theory and has already been foreshadowed in 
the previous chapter. "Tests quite naturally have different 
variances on that theory, since they comprise larger or 
smaller samples of the ** bonds ” of the mind (see Thomson, 
1935b, 87). In a hierarchical battery these natural 
variances are measured by the “ coefficient of richness." 
The “ richness ” of Test X is given by 

Put " 
Tg ' 

the same quantity as the square of Spearman's “saturation with g.”
It is, on the sampling theory, the fraction which the test forms of
the pool of bonds which is being sampled, and is the natural variance
of the test in comparison with other tests from that pool. The
“saturation with g” of Spearman's theory is the “natural standard
deviation” of the sampling theory. Even in a battery which is not
hierarchical, the formula (Chapter III, Section 5, page 43)

\[ \sqrt{\frac{A^{2} - A'}{T - 2A}} \]

will give a rough estimate of the natural standard deviation of each
test. The general principle is that tests which show the most total
correlation have the largest natural variance.
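
The estimate just described can be tried numerically. The following is a minimal sketch, not the author's own computation: it assumes that the formula referred to is Spearman's, with A the sum of a test's correlations with the other tests, A' the sum of the squares of those correlations, and T the sum of all the off-diagonal cells of the matrix (each correlation counted twice). The function name and the saturations 0.9 to 0.5 are hypothetical.

```python
import numpy as np

def g_saturations(R):
    """Estimate each test's saturation as sqrt((A^2 - A') / (T - 2A))."""
    R = np.asarray(R, dtype=float)
    off = R - np.diag(np.diag(R))        # ignore whatever stands in the diagonal cells
    A = off.sum(axis=0)                  # sum of each test's correlations with the others
    A_prime = (off ** 2).sum(axis=0)     # sum of the squares of those correlations
    T = off.sum()                        # all off-diagonal cells, each correlation twice
    return np.sqrt((A ** 2 - A_prime) / (T - 2 * A))

# A perfectly hierarchical battery built from hypothetical saturations.
l = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
R = np.outer(l, l)
np.fill_diagonal(R, 1.0)

natural_sd = g_saturations(R)            # "natural standard deviations"
richness = natural_sd ** 2               # "natural variances"
triad = R[0, 1] * R[0, 2] / R[1, 2]      # richness of test 1 from two other tests
```

In a perfectly hierarchical battery both routes return the saturations used to build the matrix; in a battery that is only roughly hierarchical the first gives an average estimate.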

2. Rotation.—Our views on the rotation of factors will depend on what
we want them to do. Burt looks upon them as merely a convenient form
of classification and is [content with the] principal axes of the
ellipsoids of density, [or with the approximation to] them given by a
good centroid analysis, as they stand, without any rotation. He
“takes out” the first centroid factor, either by calculation or by
selecting a very special group of persons each of whom has in a
battery of tests an average score equal to the population average,
each of the tests also having the same average as every other test in
the battery over this sub-group of persons (Burt, 1938a). He
concentrates attention on the remaining factors, which are “bipolar,”
having both positive and negative weights in the tests. When, as in
the article referred to, he is analysing temperaments, this fits in
well with common names for emotional characteristics, for those names
too are usually bipolar, as brave-cowardly, extravagant-stingy,
extravert-introvert, and so on.

Thurstone, on the other hand, emphatically insists on the need for
rotation if the factors are to have psychological meaning (Thurstone,
1938a, 90). The centroid factors are mere averages of the tests which
happen to form the battery, and change as tests are added or taken
away, whereas he wants factors which are invariant from battery to
battery. I think he would put invariance before psychological
meaning, and say that if a certain factor keeps turning up in battery
after battery we must ask ourselves what its psychological meaning
is. His own opinion, backed up by a great deal of experimental work
of a pioneering and exploratory nature, is that his principle of
rotating to “simple structure” gives us also psychologically
meaningful and invariant factors.

The problems of rotation and metric are not unconnected, and one
piece of evidence in favour of rotating to simple structure is that
the latter is independent of the units used in the tests. If instead
of analysing correlations we analyse covariances, with whatever
standard deviations we care to assign to the tests, we get a centroid
analysis quite different from the centroid analysis of correlations.
But if we rotate each to simple structure the tables are identical,
except, of course, that in the covariance structure each row is
multiplied by the standard deviation of the test.

For example, if we take the six tests of Chapter XI, Section 2
(page 152) and ascribe arbitrary standard deviations of 1, 2, 3, 4,
5, and 6 to them, we can replace the correlations and communalities
by covariances and variance-communalities, and perform a centroid
analysis. Since we know the proper communalities* it comes out
exactly in three factors with no residues, and gives the centroid
structure:


   Test        I         II        III

     1       .372       .567       .462
     2       .948      1.278      -.060
     3      1.969     -1.016      -.337
     4      1.002      1.072     -2.118
     5      2.992       .593      1.716
     6      3.379     -2.493       .337


When this is rotated to simple structure, by post- 
multiplication by the matrix 


             .802       .389       .453
            -.592       .416       .691
             .080      -.822       .564


the resulting table is : 


   Test        A          B          C

     1         .          .        .820
     2         .        .950      1.278
     3       2.154      .619        .
     4         .       2.577        .
     5       2.187        .       2.732
     6       4.213        .         .


This is identical with the simple structure found from the
correlations, if the rows here are divided by 1, 2, 3, 4, 5, and 6,
the standard deviations. It is definitely a point in favour of simple
structure that it is thus independent of the system of units
employed. Spearman's analysis of a hierarchical matrix into one g and
specifics also has this property of independence of the metric. If
the tetrad-differences of a matrix of correlations are zero, and we
analyse into one general factor and specifics, it is immaterial
whether we analyse correlations or covariances. The loadings obtained
in the latter case are exactly the same except, of course, that each
is multiplied by the appropriate standard deviation.

* If we have to guess communalities, our two simple structures will
differ slightly because the highest covariance in a column may not
correspond to the highest correlation. But with a battery of many
tests this difference will be [...] annulled by iteration.
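
The arithmetic of this example can be checked directly. The sketch below assumes the entries of the centroid structure, the rotating matrix, and the standard deviations are as read from the printed tables above (a few digits are restored from damaged print, so agreement is only expected to two decimal places); the variable names are illustrative.

```python
import numpy as np

# Centroid structure from the covariances (tests 1 to 6, factors I to III),
# as read from the printed table.
centroid = np.array([
    [0.372,  0.567,  0.462],
    [0.948,  1.278, -0.060],
    [1.969, -1.016, -0.337],
    [1.002,  1.072, -2.118],
    [2.992,  0.593,  1.716],
    [3.379, -2.493,  0.337],
])

# Rotating matrix used for the post-multiplication.
rotation = np.array([
    [ 0.802,  0.389, 0.453],
    [-0.592,  0.416, 0.691],
    [ 0.080, -0.822, 0.564],
])

sd = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # the arbitrary standard deviations

simple_cov = centroid @ rotation                 # simple structure in covariance units
simple_corr = simple_cov / sd[:, None]           # each row divided by its standard deviation
```

Dividing the rows by the standard deviations is the whole of the change of metric: the pattern of zero and non-zero loadings is untouched.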

At this point one is reminded of Lawley's loadings* found by the
method of maximum likelihood, for these possess the property that the
unrotated loadings obtained from correlations are already the same as
the unrotated loadings obtained from covariances, if the latter are
divided by the standard deviations. Centroid analyses, or principal
component analyses, do not possess this property. The loadings
obtained by these means from covariances cannot be simply divided by
the standard deviations to give the loadings derived from
correlations, though the one can be rotated into the other. Lawley's
loadings need no such rotation. They are, as it were, at once of the
same shape whether from covariances or from correlations and only
need an adjustment of units, such as one makes in changing, say, from
yards to feet. A field which is 50 yards broad and 20 poles long has
the same shape as one which is 150 feet broad and 330 feet long.
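
The behaviour of principal components under a change of units can be seen in a small case. This is a sketch under stated assumptions: the 3 x 3 correlation matrix and the standard deviations 1, 2, 3 are hypothetical, and `principal_loadings` is an illustrative helper, not a routine from the text. It does not implement Lawley's maximum likelihood method; it only shows the dependence on units which that method avoids.

```python
import numpy as np

def principal_loadings(S):
    """Unrotated principal-component loadings: latent vectors times sqrt(latent root)."""
    roots, vectors = np.linalg.eigh(S)
    order = np.argsort(roots)[::-1]              # largest latent root first
    roots, vectors = roots[order], vectors[:, order]
    signs = np.sign(vectors.sum(axis=0))         # remove the arbitrary sign of each vector
    signs[signs == 0] = 1.0
    return vectors * signs * np.sqrt(roots)

R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.2],
              [0.3, 0.2, 1.0]])                  # hypothetical correlations
sd = np.array([1.0, 2.0, 3.0])                   # hypothetical standard deviations
C = R * np.outer(sd, sd)                         # the corresponding covariances

from_corr = principal_loadings(R)
from_cov = principal_loadings(C) / sd[:, None]   # covariance loadings in correlation units
```

Both tables reproduce the correlations exactly, so one is a rotation of the other, but they are not the same table.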
Now, as we have seen, this property of equivalence of covariance and
correlation loadings is also possessed by simple structure. It would
thus not be unnatural to hope that Lawley's method might lead
straight to simple structure, without any rotation. But this is not
the case. But, then, simple structure is not the only position of
axes where the loadings are independent of the units of measurement
employed. Indeed, any subsequent post-multiplication of both the
simple structure tables (both that from correlations and that from
covariances) by the same orthogonal rotating matrix will leave their
equivalence with regard to units unharmed. Simple structure is only
one of an infinite number of positions which possess this property.
But it is an easily identifiable one.

* In accordance with our definition on page 170, the term “loading”
means a coefficient in a specification equation, an entry in a
pattern. In the present chapter it is used [...] when the axes
referred to are [...]. If they are oblique, then much of what is said
really [refers to entries in a] structure, not in a pattern: but the
word “loading” is still used to avoid circumlocutions, and because
the structure [...] is, except for a diagonal matrix multiplier,
identical with the pattern of the [...].
It is difficult to keep one's mind clear as to the meaning of this.
Let me recapitulate. There are some processes of analysis which,
while they give a perfect analysis in the sense of one which
reproduces the correlations (or the covariances) exactly, do not give
the same analysis for the correlations as for the covariances. The
factors they arrive at depend upon the units of measurement employed
in the tests. Such, for example, are the principal components process
and the centroid process. Such processes cannot be relied on to give
straight away and without rotation, factors which can be called
objective and scientific. Some processes, on the other hand, do give
analyses which are independent of the units. One such is Lawley's,
based on maximum likelihood. Another is Thurstone's simple-structure
process, which, though it begins by using a centroid analysis,
follows this by rotation of a certain kind.

But the principle of independence of units does not distinguish
between these processes, which both satisfy it. Still less does it
distinguish between systems of factors. For any one of the infinite
number of such systems which can be got from either simple structure
or Lawley's factors by rotation equally satisfies the principle.
Indeed, there can really be no talk of a system of factors satisfying
the principle. Any table of loadings whatever, obtained from
correlations, has, of course, corresponding to it a system differing
only in that the rows are multiplied by coefficients, a system which
would correspond with covariances. The fact that no one has
discovered a process which gives both is irrelevant. The argument is
rather as follows. If a worker believes that he has found a process
which gives the true psychological factors, then that process must be
independent of the metric, and simple structure and maximum
likelihood are both thus independent, though they do not, alas,
agree. Nor must it be forgotten that analyses from correlations are
in no way superior to those from covariances. Indeed, correlations
are covariances, [dependent upon] as arbitrary a choice of units
(namely standard deviations) as any other. But centroid axes
themselves, or principal components, without rotation, are not
admissible, for they change with the units employed. To claim that
such axes are the true ones is [...], being dependent on the chance
composition of the battery and the system of units which chances to
be used. Independence of metric is not sufficient to [validate] a
process, but it is necessary. Its absence does not prove a system of
factors to be wrong, but it makes it [certain that] the process by
which they have been arrived at will not in general give the true
factors.
3. [Specifics].—Specific factors form a fundamental problem in
factorial analysis, and yet they are practically never heard of in
[the discussion of] an analysis. It is reasonable enough to think
that [each test] may require some trick of the intellect [peculiar to
itself], yet it is not obvious that these specific factors [should
be] made as large and important as possible, [as] the plan of
minimizing the rank of a matrix does. The excess of factors over
tests which inevitably results from postulating a specific in every
test means that the factors cannot be estimated with any [great
accuracy]. Usually the accuracy is very low. [The variances of] the
determinate and the indeterminate parts of [each of Thurstone's]
factors in Primary Mental Abilities can be [found] by
post-multiplying Table 7 on his page 98 by Table 3 on his page 96.
We find:


   Factor      Variance of the      Variance of the
               Estimated Part       Indeterminate Part

     S              .611                 .389
     P              .616                 .384
     N              .825                 .175
     V              .662                 .338
     M              .431                 .569
     W              .439                 .561
     I              .397                 .603
     R              .600                 .400
     D              .519                 .481

The average for the nine factors is only 56.4 per cent. of the
variance estimated. In other words the factor estimates have large
probable errors, in some cases as large as the estimates themselves.
This has serious consequences, not to be overcome by more reliable
tests.
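
How such a division into estimated and indeterminate parts arises can be seen in the simplest case of one general factor. The sketch below does not use Thurstone's tables. It assumes the usual regression estimate of an orthogonal standardized factor, whose variance is l'R⁻¹l for a column of loadings l, and the four saturations are hypothetical.

```python
import numpy as np

# Hypothetical g saturations of four tests; communalities in the diagonal
# are implied by the loadings, with unity as the total variance of each test.
l = np.array([[0.8], [0.7], [0.6], [0.5]])
R = l @ l.T
np.fill_diagonal(R, 1.0)

# Variance of the regression estimate of g, and the part left indeterminate.
determinate = np.diag(l.T @ np.linalg.solve(R, l))
indeterminate = 1.0 - determinate
```

With these four tests rather more than a fifth of the variance of g remains indeterminate, however accurately each test score is measured.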

Using unity for every diagonal element in the matrix of 
such a battery will give factors (supposing the same 
number of them to be taken out) which will not imitate the 
correlations quite so well, but which can be estimated 
accurately. 

In fact, whether Hotelling’s process or the centroid 
process is used, with unit communalities, each factor can 
be calculated exactly for a man, given his scores. By 
exactly we mean that they are as accurate as his scores are. 
Of course, in any psychological experiment the scores may 
not be accurate in the sense that they can be exactly 
reproduced by a repetition of the experiment. Apart from 
sheer blunders and clerical errors, there is the fact that a 
man’s performance fluctuates from day to day. But 
these errors are common to any process of calculation 
which may be used on the scores. These are not the errors
for which we are criticizing estimates of a man’s factors. 
The point we are making is that factors based on com- 
munalities less than unity have a further, and large, error 
of estimation, whereas factors based on unit communalities 
(even if only one or two or a few are taken out) have no 
such further error of estimation. 


If a few such factors taken out with unit communalities are then
rotated (keeping them in the same space, i.e. not changing their
number) they still remain susceptible of exact estimation in a man.
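
The claim that factors taken out with unit communalities can be calculated exactly, before and after rotation, can be illustrated as follows. This is a sketch under assumptions: principal components stand in for "Hotelling's process", the correlation matrix is built from hypothetical saturations, and the 30 degree rotation is arbitrary.

```python
import numpy as np

l = np.array([0.8, 0.7, 0.6, 0.5])       # hypothetical values, used only to build R
R = np.outer(l, l)
np.fill_diagonal(R, 1.0)                 # unity in every diagonal cell

roots, H = np.linalg.eigh(R)
keep = np.argsort(roots)[::-1][:2]       # take out only the two largest components
M = H[:, keep] * np.sqrt(roots[keep])    # loadings of the tests on these components
W = H[:, keep] / np.sqrt(roots[keep])    # weights giving each component from the scores

theta = np.deg2rad(30.0)                 # an arbitrary rotation within the same space
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

def determinate_variance(loadings):
    """Variance of each factor accounted for by the test scores."""
    return np.diag(loadings.T @ np.linalg.solve(R, loadings))

before = determinate_variance(M)
after = determinate_variance(M @ T)
```

Both `before` and `after` come out as unity for every factor: nothing is left to be estimated, because each factor is an exact weighted sum (the columns of `W`) of the test scores.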

As soon, however, as any fractions, minimum or not, are placed in the
diagonal cells, we have thereby decided to use, in describing our
tests, more orthogonal axes than there are tests; for each test has
then a specific factor, and there are in addition the common factors.
This means in terms of our spatial model that none of the axes,
neither the common factors nor the specific factors, are in the test
space at all (except at the origin where they all cross). It is only
about the test space, of dimensions equal to the number of tests,
that we have any information from our battery. These axes are away in
outer darkness and we cannot know them, but only their projections or
shadows on the test space. Psychologists invariably confine their
attention, after making an analysis using communalities, to the
“common factor space,” of a comparatively small number of dimensions,
without, I think, being usually aware that this space is not in the
test space at all. (Thurstone's “secondary factors,” in their turn,
are not even in the common-factor space, for he uses what I might
call [...] communalities.) The effect of all this is that the factors
arrived at by an analysis which has begun by placing communalities in
the diagonal cells can never be measured exactly in a man, but only
vaguely estimated, and with maximum vagueness if minimum
communalities are used.

The [mere] fact that factors can only be estimated and not exactly
measured is, of course, not fatal. Throughout [...] the idea of
estimation in a realm [beyond that] which is experimentally known, in
a realm of [...] than that in which our measurements [...]. It is to
allow for that that the device of [degrees of freedom] is used in the
analysis of variance. But [...] the vagueness due to estimation
[...]. [The plan] for reducing the rank of a matrix [...] involves
the simultaneous maximizing of the specific variances. In Section 3
of the previous chapter reference was made to this fact that methods
of analysis which use communalities maximize the variance of the
specific factors, by reason of minimizing the number of common
factors. First take the case of the analysis of a hierarchical
battery. As was illustrated in Chapter XX, the analysis of such a
battery into one general factor only, and specifics, gives the
maximum variance possible to the specifics. The combined
communalities of the tests are less in the two-factor analysis than
in any other analysis. The mathematical expression of this is that
the trace of the [reduced] matrix, i.e. the sum of the cells of the
principal diagonal, is a minimum.

It is true that certain exceptions to this statement are
mathematically possible, but their occurrence in actual psychological
work is a practical impossibility. They have been investigated by
Ledermann (Ledermann, 1940), who finds, in the case of the
hierarchical matrix, that an exception is only possible when one of
the g saturations is greater than the sum of all the others. When the
battery is of any size, this is most unlikely to occur: and almost
always, when it did occur, the large saturation of one test would
turn out to be greater than unity, which is not permissible (the
Heywood case).

The same statement as the above, that the specifics are maximized, is
also true in general. The communalities which give the matrix its
lowest rank are in sum less than any other diagonal elements
permissible. If smaller numbers are placed in the diagonal cells, the
analysis fails unless factors with a loading of $\sqrt{-1}$ are
employed, and such factors are, of course, inadmissible.

Here again there are possibly cases where the lowest rank is not
accompanied by the lowest trace (i.e. the lowest sum of the
communalities). But here again it seems certain that if such cases do
exist, they are mathematical curiosities which would never occur in
practice.

If specific factors of such large size have any psychological
existence, what can they be? Possibilities which will occur to us are
first, that they are error factors; but errors or variations in the
subject's performance are not likely to be entirely unique to one
test. Secondly, they have been attributed to sampling errors in the
coefficients of correlation; but these sampling errors are themselves
correlated, and so give rise to common factors, not to specific
factors. Thirdly, they may be real mental factors, unique to that
test, needed only by it. But what remarkable consequences follow if
we accept that. I devise a brand-new test and lo, in the mind of man
there exists a specific ability to do that test and, moreover, an
ability [needed in no] other activity. Further, every individual I
meet possesses this specific ability in large or small amount. The
idea in this form is really fantastic.

It would seem then that the specifics cannot be really unique, but
only unique in this battery. This leads to the presumption that the
tests of a battery possess specific factors only because there does
not happen to be in the battery any other test to share the specific,
or at least part of it, and prove it to be really one or more common
factors. On this view, specifics will disappear when a test has been
tried in a large number of batteries, or in a sufficiently large
battery. Not only does this seem unlikely when one considers that in
every battery the minimum communalities and maximum specifics are
insisted on, but it also has peculiar consequences in regard to the
number of primary factors. Consider a battery consisting of, say, two
dozen tests, analysed into, say, seven common factors plus, of
course, two dozen specifics. The latter, it must be remembered, are
all orthogonal, all uncorrelated with one another. On the hypothesis
that they are really factors which just do not happen to have found a
partner, like wallflowers at a ball, there must exist at least two
dozen other primary factors waiting to be discovered in a larger
battery. And so with every battery of tests. The number of primary
factors must be larger than all the tests hitherto invented, which
does not seem to be parsimonious. I cannot help fearing that there is
something wrong with the idea of reducing the matrix of correlation
coefficients or covariances to its lowest possible rank, and then
calling the descriptive variates to which this leads “factors of the
mind”: something wrong with the whole idea of attributing as much as
possible of the variance of a test to a unique factor, something
wrong with the “parsimony” argument upon which all this is based. It
leads to too many difficulties to which it is possible, but not, I
think, advisable, to shut one's eyes. Moreover, the reciprocity
principle, which identifies factors and loadings obtained by
correlating tests with loadings and factors obtained by correlating
persons, works only when there are no specifics involved. I would
like to see a number of existing squares of correlation coefficients
re-analysed with full variance in each diagonal cell and the results
considered.

There would be no guessing of the communalities, and no repetitions
or iterations of the calculation to determine them. Tests of
significance of residues would be more easily made, and although
rather more factors would be necessary before the residues became
insignificant, they would have the advantage of more accurate
estimation in any man. True, such factors would be confined to the
particular test space of that battery, and admittedly a factor of the
mind is not likely to be an exact composite of the tests of any one
battery. But the point is an academic one, for the common-factor
space in which communality factors exist, is just as much a creation
of the particular battery as are axes determined within the battery
space.

I must not be misunderstood as saying that no specific factors exist
at all. What I am sceptical about is the procedure of making the
specific factors in every battery as large as possible, by the
automatic application of a mathematical device. That every test may
well have some unique quality for any individual person seems
conceivable, though I do not think this special feature of the test
will be felt as a peculiarity by every person who tries the test. I
think any such unique quality would be a blemish in the test, just as
unreliability is a blemish, and that the psychologist should
endeavour to make tests which are neither [unreliable nor endowed]
with unique peculiarities. Probably he cannot avoid a certain amount
of uniqueness, just [as he cannot avoid] a certain amount of
unreliability. But I do not see the need for ascribing [...] in order
to reduce the number [of common factors].

A critic may point out that, if even small truly unique parts of the
tests are admittedly present, there will always be the need for the
large number of specifics. Possibly so, [but these will be of] no
great importance, if the tests are good ones; specifics with an
influence as unimportant as the causes are of the residual
[correlations] which we in any case ignore after statistical testing.


It is true that by the use of communalities the total number of
loadings to be estimated is reduced to a minimum. That way of putting
the parsimony argument [...]. What I doubt is whether too high a
price is not paid, since this same procedure maximizes the specifics,
and decides their importance without any psychological consideration
whatever being given to the question.

The practical conclusions I would draw from these con- 
siderations about the nature of specific factors are that a 
battery used for factorial analysis should be composed of 
tests of high communality in that battery: or that, if 
tests are admitted which by the mathematical principle 
of rank reduction are allotted low communalities, the 
psychologist should agree that these tests do draw, each 
of them, upon factors of the mind not represented elsewhere 
in the battery. 

Such is the argument against minimum communalities. 

For them is the hope that some day, despite their drawbacks, the
factors they lead to may prove to be something real, perhaps have
some physiological basis. And their defender may plead that the
estimates of these factors are as good as the estimates we find
useful in predicting educational or occupational efficiency.
4. Oblique factors.—I think it is pretty certain that Thurstone took
to oblique factors because he wants simple structure at all costs.
Certainly oblique factors make it much easier to reach simple
structure: too easy, Reyburn and Taylor say. It will be found far
more often than it really exists, they add. On the other hand,
Thurstone can point to his box example and his trapezium example and
say with truth that simple structure enabled him to find “realities,”
can say that the oblique simple structure is something more real, in
the ordinary common-sense everyday use of the word, than the
orthogonal second-order factors which are an alternative.

Other workers, not at all wedded to the ideas of simple structure,
have also declared their belief in oblique factors, e.g. Raymond
Cattell, and, I think, many who feel inclined to work in terms of
“clusters.” In ordinary life, weight and height are both measures of
something real, although they are correlated. We could analyse them
into two uncorrelated factors a and b, or into three for that matter,
but certainly no one would use these in ordinary life. It is,
however, just conceivable that some pair of hormones (say) might be
found which corresponded, not one of them to height and one to
weight, but one to orthogonal factor a and another to orthogonal
factor b. It is far too early to state anything more than a
preference for orthogonal or oblique factors. Opinion is turning, I
think, toward the acceptance of the latter.


PARAGRAPHS

1. Textbooks on matrix algebra. 2. Matrix notation. 3. Spearman's
Theory of Two Factors. 4. Multiple common factors. 5. Orthogonal
rotations. 6. Orthogonal transformation from the two-factor equations
to the sampling equations. 7. Hotelling's “principal components.”
8. The pooling square. 9. The regression equation. 9a. Relations
between two sets of variates. 10. Regression estimates of factors.
10a. Ledermann's short cut. 11. Direct and indirect vocational
advice. 12. Computation methods. 13. Bartlett's estimates of factors.
14. Indeterminacy. 15. Finding g saturations from an imperfectly
hierarchical battery. 16. Sampling errors of tetrad-differences.
17. Selection from a multivariate normal population. 17a. Maximum
likelihood estimation (by D. N. Lawley). 18. Reciprocity of loadings
and factors in persons and traits. 19. Oblique factors. Structure and
pattern. 19a. Second-order factors. 20. Boundary conditions. 21. The
sampling of bonds.


1. Textbooks on matrix algebra.—Some knowledge of matrix algebra is
assumed, such as can be gained from the mathematical introduction to
L. L. Thurstone's Multiple Factor Analysis (Chicago, 1947); Turnbull
and Aitken's Theory of Canonical Matrices, Chapter I (London and
Glasgow, 1932); H. W. Turnbull's The Theory of Determinants,
Matrices, and Invariants, Chapters I-V (London and Glasgow, 1929);
and M. Bôcher's Introduction to Higher Algebra, Chapters II, V, and
VI (New York, 1936).

I have adopted Thurstone's notation in Sections 19 and 19a of the
mathematical appendix, and in Chapters XI and XIII in describing his
work. But I have not made the change elsewhere because readers would
then be incommoded in consulting my own former papers. The chief
differences are as follows: my M is Thurstone's F, for centroid
factors, my Z is Thurstone's S ÷ √N, and my F is Thurstone's P ÷ √N.

2. Matrix notation.—Let X be the matrix of raw scores of p persons in
n tests, with n rows and p columns; and when normalized by rows, let
it be denoted by Z. The letters z and Z in the text of this book mean
standardized scores, which are used in practical work, but in this
appendix they mean normalized scores, so that

\[ ZZ' = R \tag{1} \]

the matrix of correlations between n tests.
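
The effect of normalizing by rows can be checked on made-up scores. In this sketch the raw scores are random numbers and the names are illustrative; "normalized" is taken to mean that each row is measured from its mean and scaled so that its sum of squares is unity.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))    # hypothetical raw scores: n = 3 tests (rows), p = 50 persons

deviations = X - X.mean(axis=1, keepdims=True)                            # measure from each test mean
Z = deviations / np.sqrt((deviations ** 2).sum(axis=1, keepdims=True))    # unit sum of squares per row

R = Z @ Z.T                     # equation (1): the matrix of correlations
```

Standardized scores, with unit standard deviation, differ from these normalized scores only by the factor √p in every row (taking the standard deviation over p), which is why ZZ' needs no further division by the number of persons.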

For many purposes it is convenient to think of solid matrices like Z
as column (or row) vectors of which each element represents a row (or
column). Thus Z can be thought of as a column vector z, of which each
element represents in a collapsed form a row of test scores. Thus
with three tests and four persons:

\[
Z = \begin{pmatrix}
z_{11} & z_{12} & z_{13} & z_{14} \\
z_{21} & z_{22} & z_{23} & z_{24} \\
z_{31} & z_{32} & z_{33} & z_{34}
\end{pmatrix}
= \begin{pmatrix} z_{1} \\ z_{2} \\ z_{3} \end{pmatrix}
= z \tag{2}
\]


In the theory of mental factors each score is represented as a loaded
sum of the normalized factors f, the loadings being different for
each test, i.e.

\[ z = Mf \quad \text{(specification equations)} \tag{3} \]

where M is the matrix of loadings and f the vector of v factors,
collapsed into a column from F, the full matrix, of dimensions v × p.

We note that p = number of persons,
             n = number of tests,
             v = number of factors.

The dimensions of M are n × v. Equation (3) represents n simultaneous
equations, and the form Z = MF represents np simultaneous equations.


We now have

\[ R = ZZ' = (MF)(MF)' = MFF'M' \]

If the factors are orthogonal, we have

\[ FF' = I \]

the unit matrix, and therefore

\[ R = MM' \]

The resemblance in shape between this and

\[ R = ZZ' \]

leads to a parallelism between formulæ concerning persons and factors
(Thomson, 1935b, 75; Mackie, 1928a, 74, and 1929, 34).

3. Spearman's Theory of Two Factors assumes that M is of the special
form

\[
M = \begin{pmatrix}
l_{1} & m_{1} & \cdot & \cdots & \cdot \\
l_{2} & \cdot & m_{2} & \cdots & \cdot \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
l_{n} & \cdot & \cdot & \cdots & m_{n}
\end{pmatrix}
\tag{7}
\]

and therefore

\[ R = ll' + M_{1}^{2} \tag{8} \]

where $M_{1}$ is the diagonal matrix which forms the right-hand end
of M, and l is the first column of M. In this form it is clear that R
is of rank 1 except for its principal diagonal. Its component ll' is
the “reduced correlational matrix” of the Spearman case, and is
entirely of rank 1. The elements $l_{1}^{2}, l_{2}^{2}, \ldots,
l_{n}^{2}$, which form the principal diagonal of ll', are called
“communalities.”
4. Multiple common factors.—When more than one common factor is
present, M takes the form

\[ M = (M_{0} \mid M_{1}) \tag{9} \]

where $M_{0}$ is the matrix of loadings of the common factors,
represented in the Spearman case by the simple column l. We have then

\[ R = MM' = M_{0}M_{0}' + M_{1}^{2} \tag{10} \]

where the “reduced correlation matrix” $M_{0}M_{0}'$ is of rank r,
the number of common factors, and is identical with R except for
having “communalities” in its principal diagonal.
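
The partition of M into common and specific parts can be written out numerically. The following sketch uses hypothetical loadings of four tests on two common factors; the specific loadings are chosen so that every test has unit variance.

```python
import numpy as np

# Hypothetical loadings of n = 4 tests on r = 2 orthogonal common factors.
M0 = np.array([[0.7,  0.3],
               [0.6,  0.4],
               [0.5, -0.2],
               [0.4, -0.5]])

communalities = (M0 ** 2).sum(axis=1)
M1 = np.diag(np.sqrt(1.0 - communalities))   # one specific loading per test
M = np.hstack([M0, M1])                      # the partitioned form (M0 | M1)

R = M @ M.T                                  # full correlation matrix
reduced = M0 @ M0.T                          # reduced correlation matrix
```

The two matrices agree in every cell off the diagonal; on the diagonal `R` has unity and `reduced` has the communalities.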

5. Orthogonal rotations.—If we express the v factors f in terms of w
new factors $\varphi$ by the equation

\[ f = A\varphi \tag{11} \]

where A is a matrix of v rows and w columns, we have

\[ z = Mf = MA\varphi \tag{12} \]

an expression of the tests z as linear loaded sums of a different set
of factors, with a matrix of loadings MA. If

\[ AA' = I \tag{13} \]

the new factors $\varphi$ are orthogonal like the old ones. They can
be as numerous as we like, but not less than the number of tests
unless the matrix R is singular. (12) represents a rigid rotation of
the orthogonal axes f into new positions, with dimensions added or
abolished.
6. The sampling theorj.—The following transformation 
is of interest as showing the connexion between the 
Theory of Two Factors and the Sampling Theory (‘Thom- 
son, 1935b, 85). We shall write it out for three tests only, 
but it is quite general. Consider the orthogonal matrix : 


Uli mll Iml lm) mml mm Imm} mmm! 
ee: NETS SER p M d te aime aes S CON leni 
| mili U mm mm | —lml —im mmm, —Imm 
Imi mm lll imm; —mll mmm —lim i —mlm | 
| Um} mim Imm =l! mmm —mll —Iml! —mml (14) 
E re tI ot ih Ms ans <| (14 
mml; —iml —mil mmm | lll —Imm —mln I Um 
mmi —llm mmm  —mili — Imm ul —mml: Imi 
1 1 | 
Imm | mmm —lim —iml! —mim —mml Ul; mll 
---——- Mitac su IL e TN || ET 
| mmm | —Imm —mlm —mml| üm imi mil; —Ill 


. 


wherein the omitted subscripts 
understood as existing always in 
means mijl,l,. 

If we take for 4 in Equation 
of this orthogonal matrix, and fo 
(7) with three tests, 
factors, yielding : 


1, 2, and 8 are to be 
that order, so that mll 


(12) the first four rows 
r M the Spearman form 
the result is to transfer to eight new 


A = ll, + mp, + lmp, + MMP, 
22 lp, + mig, + lm, + MMs s . (8) 
25 = Lhe, + mi, + lm, + MMP; 

Each $z$ is here in normalized units. If, however, we
change to new units by multiplying the three equations
by $l_1$, $l_2$, and $l_3$ respectively, we have:

$$
\begin{aligned}
l_1z_1 &= l_1l_2l_3\,\phi_1 + l_1m_2l_3\,\phi_3 + l_1l_2m_3\,\phi_4 + l_1m_2m_3\,\phi_7 \\
l_2z_2 &= l_1l_2l_3\,\phi_1 + m_1l_2l_3\,\phi_2 + l_1l_2m_3\,\phi_4 + m_1l_2m_3\,\phi_6 \\
l_3z_3 &= l_1l_2l_3\,\phi_1 + m_1l_2l_3\,\phi_2 + l_1m_2l_3\,\phi_3 + m_1m_2l_3\,\phi_5
\end{aligned} \tag{16}
$$

and the variates $l_1z_1$, $l_2z_2$, and $l_3z_3$ are now susceptible of
the explanation that each is composed of $l_i^2N$ small equal
components drawn at random from a pool of $N$ such
components, all-or-none in nature. In that case $l_1^2l_2^2l_3^2N$
components would probably appear in all three drawings
($\phi_1$); $l_1^2l_2^2m_3^2N$ components would probably appear in the
first two drawings, but not in the third ($\phi_4$); and so on
down to $m_1^2m_2^2m_3^2N$ components, which would not appear
at all ($\phi_8$, which is missing from the equations).

The transformation can, of course, be reversed, and the
sampling theory equations converted into the two-factor
equations.
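
The transformation of this section can be checked numerically. The following is only a sketch under stated assumptions: the saturations 0.9, 0.8, 0.6 are invented for illustration, and the 8 x 8 matrix is built as a Kronecker product of three 2 x 2 blocks, which gives the same entries as the matrix of this section with its rows and columns in binary order (000, 001, ..., 111) rather than the order printed there. The helper names are hypothetical.

```python
from math import sqrt

def block(l):
    """2x2 symmetric orthogonal block [[l, m], [m, -l]] with m = sqrt(1 - l^2)."""
    m = sqrt(1.0 - l * l)
    return [[l, m], [m, -l]]

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

l = [0.9, 0.8, 0.6]                      # illustrative g saturations l_1, l_2, l_3
m = [sqrt(1.0 - x * x) for x in l]

# 8x8 orthogonal matrix; rows and columns in binary order 000, 001, ..., 111
H = kron(kron(block(l[0]), block(l[1])), block(l[2]))
HHt = matmul(H, transpose(H))            # should be the unit matrix

# The four rows used for g, s_1, s_2, s_3 are the patterns 000, 100, 010, 001
A = [H[0], H[4], H[2], H[1]]

# Spearman form: z_i = l_i g + m_i s_i
M = [[l[0], m[0], 0.0, 0.0],
     [l[1], 0.0, m[1], 0.0],
     [l[2], 0.0, 0.0, m[2]]]

MA = matmul(M, A)                        # loadings of the tests on the eight new factors
R = matmul(MA, transpose(MA))            # correlations reproduced from the new factors
```

Here `MA[0][0]` comes out as the product of the second and third saturations, `R[0][1]` as the product of the first two, and the last column of `MA` (the factor corresponding to components drawn by no test) is zero for every test.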

7. Hotelling's “principal components” are the principal
axes of the ellipsoids of equal density:

$$z'R^{-1}z = \text{constant} \tag{17}$$

when the test vectors are orthogonal axes (Hotelling, 1933).
To find the principal axes involves finding the latent
roots of $R^{-1}$. The Hotelling process consists of (a) a
rotation of the axes from the orthogonal test axes to the
directions of the principal axes; and (b) a set of strains
and stresses along these new axes to standardize the factors,
making the ellipsoid spherical and the original axes oblique.

The transformation from the tests to the Hotelling factors
$\gamma$ being from Equation (3):

$$z = M\gamma \quad (M \text{ square})$$

the ellipsoids (17) become:

$$\text{constant} = z'R^{-1}z = \gamma'(M'R^{-1}M)\gamma = \gamma'\gamma \tag{18}$$

since they become spheres. Therefore we must have:

$$M'R^{-1}M = I \tag{19}$$

The locus of the mid points of chords of $z'R^{-1}z$ whose
direction cosines are $h'$ is the plane $h'R^{-1}z = 0$, and if this
is a principal plane it is at right angles to the chords it
bisects, i.e.:

$$h'R^{-1} = \lambda h'$$

which has non-trivial solutions only for:

$$|R^{-1} - \lambda I| = 0$$

the roots $\lambda$ of which are the “latent roots” of $R^{-1}$, while
each $h'$ is a “latent vector.”

Now, if $H$ is the matrix of normalized latent vectors of
$R^{-1}$, we have:

$$H'R^{-1}H = \Lambda$$

where $\Lambda$ is the diagonal matrix of the latent roots of $R^{-1}$;
so that a solution for $M$ corresponding to rotation to the
principal axes and subsequent change of units to give a
sphere is seen to be:

$$M = H\Lambda^{-1/2} \tag{20}$$

The latent vectors of $R$ are the same as those of $R^{-1}$
or of any power of $R$, and Hotelling's process described
in the text (Chapter VII) finds the latent roots (forming the
diagonal matrix $D$) and the latent vectors (forming $H$) of
$R$. We then have:

$$M = HD^{1/2} \tag{21}$$

For the convergence of the process, see Hotelling's paper
of 1933, pages 14 and 15.

Since in Hotelling analyses $M$ is square, we can write:

$$\gamma = M^{-1}z = (HD^{1/2})^{-1}z = D^{-1/2}H'z = D^{-1}M'z \tag{22}$$

Each factor $\gamma$, that is, can be found from a column of
the matrix $M$, divided by the corresponding latent root,
used as loadings of the test scores.
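
As a rough numerical illustration of Equations (21) and (22), the sketch below finds the largest latent root and its latent vector by repeated multiplication (ordinary power iteration, used here as a stand-in for the iterative process of Chapter VII, not as a transcription of it). The correlation matrix and the score vector are invented for the example, and the function name is hypothetical.

```python
from math import sqrt

def largest_root(R, iterations=200):
    """Power iteration: multiply a trial vector by R repeatedly and renormalise."""
    n = len(R)
    h = [1.0 / sqrt(n)] * n
    root = 0.0
    for _ in range(iterations):
        v = [sum(R[i][j] * h[j] for j in range(n)) for i in range(n)]
        root = sqrt(sum(x * x for x in v))   # length of R h tends to the largest root
        h = [x / root for x in v]
    return root, h

R = [[1.0, 0.6, 0.5],
     [0.6, 1.0, 0.4],
     [0.5, 0.4, 1.0]]                       # illustrative correlations

d1, h1 = largest_root(R)
loadings = [x * sqrt(d1) for x in h1]       # first column of M = H D^(1/2)

z = [1.0, 0.5, -0.2]                        # one person's standardized scores (made up)
gamma1 = sum(a * s for a, s in zip(loadings, z)) / d1   # first element of D^-1 M' z
```

The squares of the loadings in the column add up to the latent root, which is the variance "taken out" by the first component.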
8. The pooling square.—If the matrix of correlations of
$a + b$ variates is:

$$\begin{pmatrix} R_{aa} & R_{ab} \\ R_{ba} & R_{bb} \end{pmatrix} \tag{23}$$

and if the standardized variates $a$ are multiplied by weights
$u$, the standardized variates $b$ by weights $w$, and each team
is summed to give two composite scores, the
resulting variances and covariances are:

$$\begin{pmatrix} u'R_{aa}u & u'R_{ab}w \\ w'R_{ba}u & w'R_{bb}w \end{pmatrix} \tag{24}$$

as can be seen by writing out the latter expressions at
length. The battery intercorrelation is therefore:

$$\frac{u'R_{ab}w}{\sqrt{(u'R_{aa}u \times w'R_{bb}w)}} \tag{25}$$

If weights are applied to raw scores, each applied weight
must be multiplied by each pre-existing standard deviation,
in (25).

If there is only one variate in the $a$ team, (25) becomes:

$$\frac{w'r_{ba}}{\sqrt{(w'R_{bb}w)}} \tag{26}$$

where $r_{ba}$ represents a whole column of correlation coefficients.
The values of $w$ for which this reaches its maximum
value will satisfy the equation:

$$\frac{r_{ba}}{w'r_{ba}} = \frac{R_{bb}w}{w'R_{bb}w} \tag{27}$$

that is:

$$w = \text{a scalar} \times R_{bb}^{-1}r_{ba} \tag{28}$$

consistent with the ordinary method of deducing regression
coefficients.
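
A small sketch of Equation (25) may make the pooling square concrete. The correlations below are invented; the $a$ team has one variate and the $b$ team two, and the weights (4, 1) are proportional to $R_{bb}^{-1}r_{ba}$ for these particular figures, worked out by hand. The function names are hypothetical.

```python
from math import sqrt

def quad(u, R, w):
    """Bilinear form u' R w for lists u, w and a matrix R given as a list of rows."""
    return sum(u[i] * R[i][j] * w[j] for i in range(len(u)) for j in range(len(w)))

def battery_correlation(u, w, Raa, Rab, Rbb):
    """Correlation of the two weighted composites, as in Equation (25)."""
    return quad(u, Rab, w) / sqrt(quad(u, Raa, u) * quad(w, Rbb, w))

Raa = [[1.0]]                    # one variate in the a team
Rab = [[0.6, 0.4]]               # its correlations with the two b variates
Rbb = [[1.0, 0.5],
       [0.5, 1.0]]

equal = battery_correlation([1.0], [1.0, 1.0], Raa, Rab, Rbb)   # equal weights
best = battery_correlation([1.0], [4.0, 1.0], Raa, Rab, Rbb)    # regression weights
```

With these figures the equally weighted sum correlates about 0.577 with the single variate, and the regression-weighted sum about 0.611, which is the multiple correlation.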
9. The regression equation.—If $z_0$ is the one variate in
the $a$ team, and $z$ are the $b$ team, and if:

$$\hat z_0 = w'z \tag{29}$$

we wish to make $S(z_0 - \hat z_0)^2$ a minimum, that is:

$$\frac{\partial}{\partial w}S(z_0 - w'z)^2 = 0$$

$$Sz_0z' = w'Szz'$$

$$w' = r_{ba}'R_{bb}^{-1}$$

$$\hat z_0 = r_{ba}'R_{bb}^{-1}z \tag{30}$$

If $R$ is the matrix of correlations of all the tests including
$z_0$, the regression estimate of any one of the tests from a
weighted sum of the others is given by:

$$\text{determinant } R_z = 0 \tag{31}$$

where $R_z$ is $R$ with the row corresponding to the variate
to be estimated replaced by the row of variates.
9a. Relations between two sets of variates.—(Hotelling
1935a, 1936, M. S. Bartlett 1948). If two sets of variates
have correlation coefficients:

$$\begin{pmatrix} R_{aa} & R_{ab} \\ R_{ba} & R_{bb} \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} A & C' \\ C & B \end{pmatrix}$$

and if the variates of the $B$ team are fitted with weights $b$,
then the correlations of the $B$ team, thus weighted, with
the separate tests of the $A$ team are given by:

$$\frac{C'b}{\sqrt{(b'Bb)}} \tag{31.1}$$

and the square of the correlation coefficient between the
two teams is then:

$$\lambda = \frac{b'CA^{-1}C'b}{b'Bb} \tag{31.2}$$

The maximum intercorrelation, and other points of inflexion
in $\lambda$, will be given by:

$$d\lambda/db = 0$$

that is:

$$(CA^{-1}C' - \lambda B)b = 0 \tag{31.3}$$

a set of homogeneous equations in $b$. We must therefore
have:

$$|CA^{-1}C' - \lambda B| = 0 \tag{31.4}$$

an equation for $\lambda$ with as many non-zero roots as the number
of variates in the smaller team. For any one of these
roots $\lambda$, the weights $b$ are proportional to the co-factors of
any row of $(CA^{-1}C' - \lambda B)$. The corresponding weights
$a$ for the $A$ team are then found by condensing the team $B$
(using weights $b$) to a single variate and carrying out an
ordinary regression calculation.

The result is to “factorize” each team into as many
orthogonal axes as there are variates. These axes are related
to one another in pairs corresponding to the roots $\lambda$.
Each axis is orthogonal to all the others except its own
opposite number in the space of the other team, arising
from the same root $\lambda$ as it does, to which axis it is inclined
at an angle $\arccos\sqrt{\lambda}$. Where one team has $m$ more
variates than the other, $m$ of the roots will be zeros and
the corresponding axes will be at right angles to the whole
space of the other team. This form of factorizing has been
called by M. S. Bartlett (1948) external factorizing, since
the position of the “factors” or orthogonal axes in each
team, in each space, is dictated by the other team.

The weightings corresponding to the largest root give 
the closest possible correlation of the two weighted teams. 
If the two teams are duplicate forms of the same tests, this 
is the maximum attainable battery or team reliability 
(Thomson 1940, 1947, 1948). In this case Peel (Nature, 
1947) has shown that a simpler equation than 31.4 gives
the required roots. If $\lambda = \mu^2$ Peel's equation is:

$$|C - \mu A| = 0 \tag{31.5}$$

where $A$ differs from $C$ only in the diagonal elements, which
in $A$ are unities but in $C$ are reliabilities $r_{ii}$ of the individual
tests.

Green (1950) gives a transformation of this equation
which enables Hotelling's iterative process (see Chapter
VII) to be used to find $\mu$, the maximum battery reliability.
For the diagonal elements $r_{ii} - \mu$ of the matrix $(C - \mu A)$,
Green writes:

$$\left[r_{ii} - \frac{\mu}{1 - \mu}(1 - r_{ii})\right](1 - \mu)$$

when 31.5 becomes equivalent to:

$$|DCD - \theta I| = 0 \tag{31.6}$$

wherein $D$ is a diagonal matrix with elements $(1 - r_{ii})^{-1/2}$,
$I$ is the unit matrix, and $\theta = \mu/(1 - \mu)$. The latent vector
$y$ corresponding to the largest latent root of $DCD$ can then
be found by Hotelling's process, and the best weights for
maximum battery reliability are proportional to $Dy = W$.
The maximum reliability thus attained is:

$$\mu = W'CW/W'AW$$
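
The route through Equation (31.6) can be sketched numerically as follows. This is an illustration only: the reliabilities and intercorrelation are invented, power iteration again stands in for the iterative process of Chapter VII, and the function name is hypothetical.

```python
from math import sqrt

def max_battery_reliability(C, iterations=500):
    """Largest root theta of D C D by power iteration, then mu = theta / (1 + theta)."""
    n = len(C)
    d = [1.0 / sqrt(1.0 - C[i][i]) for i in range(n)]          # elements of D
    G = [[d[i] * C[i][j] * d[j] for j in range(n)] for i in range(n)]   # D C D
    y = [1.0 / sqrt(n)] * n
    theta = 0.0
    for _ in range(iterations):
        v = [sum(G[i][j] * y[j] for j in range(n)) for i in range(n)]
        theta = sqrt(sum(x * x for x in v))
        y = [x / theta for x in v]
    mu = theta / (1.0 + theta)                                  # since theta = mu / (1 - mu)
    W = [d[i] * y[i] for i in range(n)]                         # weights proportional to D y
    return mu, W

# Two tests with reliabilities 0.8 and 0.7 and intercorrelation 0.6 (made-up figures)
C = [[0.8, 0.6],
     [0.6, 0.7]]
A = [[1.0, 0.6],
     [0.6, 1.0]]

mu, W = max_battery_reliability(C)
wcw = sum(W[i] * C[i][j] * W[j] for i in range(2) for j in range(2))
waw = sum(W[i] * A[i][j] * W[j] for i in range(2) for j in range(2))
ratio = wcw / waw                                               # should agree with mu
```

When both reliabilities are 0.8 and the intercorrelation 0.6, the weights are equal and the function returns 0.875, that is (0.8 + 0.6)/(1 + 0.6).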


10. Regression estimates of factors.—When in the
specifications:

$$z = Mf \tag{3}$$

the factors outnumber the tests, they cannot be measured
but only estimated. To all men with the same set of
scores $z$ will be attributed the same set of estimated factors
$\hat f$, though their “true” factors may be different. The
regression method of estimation minimizes the squares of
the discrepancies between $\hat f$ and $f$, summed over the men.
The regression equation (31) will be for one factor $f_1$:

$$\begin{vmatrix} \hat f_1 & z' \\ m_1 & R \end{vmatrix} = 0 \tag{32}$$

where $m_1$ is a column of $M$. Expanding, we have:

$$\hat f_1 = m_1'R^{-1}z$$

and in general:

$$\hat f = M'R^{-1}z \tag{33}$$

or, separating the common factors and the specifics:

$$\hat f_0 = M_0'R^{-1}z \tag{34}$$

$$\hat f_1 = M_1R^{-1}z \tag{35}$$

the latter of which shows that we know the proportionate
weights for each specific (the rows of $R^{-1}$) even before we
know whether that specific exists (Wilson, 1934, 194).
The matrix of covariances of the estimated factors is:

$$M'R^{-1}M = \begin{pmatrix} M_0'R^{-1}M_0 & M_0'R^{-1}M_1 \\ M_1R^{-1}M_0 & M_1R^{-1}M_1 \end{pmatrix} \tag{36}$$

a square idempotent matrix of order equal to the number
of factors, but trace only equal to the number of tests.

For one common factor, (34) reduces to Spearman's
estimate:

$$\hat g = \frac{1}{1 + S}\sum_i \frac{r_{ig}}{1 - r_{ig}^2}z_i$$

where

$$S = \sum_i \frac{r_{ig}^2}{1 - r_{ig}^2}$$

while $K = M_0'R^{-1}M_0$ in (36) reduces to $S/(1 + S)$, the
variance of $\hat g$.

10a. Ledermann's short cut (1938a, 1939b).—The above
requires the calculation of the reciprocal of the large square
matrix $R$. Ledermann's short cut only requires the reciprocal
of a matrix of order equal to the number of common
factors. As long as the factors are orthogonal we have:

$$R = M_0M_0' + M_1^2 \tag{10}$$

and the identity

$$M_0'M_1^{-2}(M_0M_0' + M_1^2) = (M_0'M_1^{-2}M_0 + I)M_0' = (J + I)M_0' \text{ say.}$$

Premultiplying by $(I + J)^{-1}$ and postmultiplying by $R^{-1}$
we reach

$$(I + J)^{-1}M_0'M_1^{-2} = M_0'R^{-1} \tag{36.1}$$

and the left-hand quantity can then be used in Equation
(34).

This short cut requires modification when the factors are
oblique. See Equations (70.1) to (70.4) below.
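
For a single common factor the short cut can be verified without inverting $R$ at all. The sketch below (invented saturations, hypothetical variable names) forms the weights $(1 + J)^{-1}M_0'M_1^{-2}$, where $J$ is the scalar $S$, and checks them by multiplying back by $R$: if they equal $M_0'R^{-1}$ the product must return $M_0'$.

```python
l = [0.9, 0.8, 0.7, 0.6]                      # illustrative g saturations r_ig
n = len(l)

# R = M0 M0' + M1^2 for one common factor: unit diagonal, products l_i l_j elsewhere
R = [[1.0 if i == j else l[i] * l[j] for j in range(n)] for i in range(n)]

S = sum(x * x / (1.0 - x * x) for x in l)     # with one factor, J is the scalar S
weights = [x / (1.0 - x * x) / (1.0 + S) for x in l]    # (1 + J)^-1 M0' M1^-2

# weights times R should give back the saturations M0'
recovered = [sum(weights[i] * R[i][j] for i in range(n)) for j in range(n)]

# K = M0' R^-1 M0, the variance of the estimated g
variance_of_estimate = sum(w * x for w, x in zip(weights, l))
```

The last quantity equals $S/(1 + S)$, so the estimate always has less than unit variance.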

11. Direct and indirect vocational advice.—If $z_0$ is an
occupation and $z$ a battery of tests, the estimate of a
candidate's occupational ability is:

$$\hat z_0 = r_0'R^{-1}z \tag{37}$$

where the $r_0$ are the correlations of the occupation with the
tests. If $z_0$ can be specified in terms of the common
factors of $z$, and a specific $s_0$ independent of $z$, then an
indirect estimate of $z_0$ via the estimated $f_0$ is possible. We
have:

$$z_0 = m_0'f_0 + s_0 \tag{38}$$

where $m_0'$ is a row of occupation loadings for the common
factors $f_0$ of $z$, and also:

$$\hat f_0 = M_0'R^{-1}z$$

Substitution in (38) assuming an average $s_0$ ($= 0$)
gives:

$$\hat z_0 = m_0'M_0'R^{-1}z \tag{39}$$

But:

$$m_0'M_0' = r_0' \tag{40}$$

and (39) is identical with (37) (Thomson, 1936a). If, however,
$s_0$ is not independent of the specifics $s$ of the battery,
(40) will not hold, and the estimate (39) made via an estimation
of the factors will not agree with the correct estimate
(37).

12. Computation methods.—The “Doolittle” method of
computing regression coefficients is widely used in America
(Holzinger, 1937a, 82). Aitken's method, used and
explained in the text, is in the present author's opinion
superior (Aitken, 1937a and b, with earlier references).

Regression calculations and many others are all special
cases of the evaluation of a triple matrix product $XY^{-1}Z$,
where $Y$ is square and non-singular, and $X$ and $Z$ may
be rectangular. The Aitken method writes these matrices
down in the form:

$$\begin{array}{c|c} Y & Z \\ \hline -X & \cdot \end{array}$$

and applies pivotal condensation until all entries to the
left of the vertical line are cleared off. All pivots must
originate from elements of $Y$. By giving $X$ and $Z$ special
values (including the unit matrix $I$) the most varied
operations can be brought under the one scheme.
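
The scheme can be sketched in a few lines. The routine below is a plain forward elimination on the bordered array, offered as an illustration of the idea rather than as a reproduction of the worked layout of the text; it takes the pivots down the diagonal of $Y$ without any reordering, which is an assumption that fails if a zero pivot is met. The function name and the figures are hypothetical.

```python
def triple_product(X, Y, Z):
    """Evaluate X Y^-1 Z by condensing the array [Y | Z] over [-X | 0]."""
    n = len(Y)
    rows = [list(Y[i]) + list(Z[i]) for i in range(n)]
    rows += [[-x for x in row] + [0.0] * len(Z[0]) for row in X]
    for k in range(n):
        pivot = rows[k][k]                    # pivots come only from the Y part
        if abs(pivot) < 1e-12:
            raise ValueError("zero pivot; the rows of Y would need reordering")
        for i in range(k + 1, len(rows)):
            factor = rows[i][k] / pivot
            if factor != 0.0:
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[k])]
    # the left of the lower rows is now cleared; the right holds X Y^-1 Z
    return [row[n:] for row in rows[n:]]

# Regression weights r0' R^-1 as a special case: X = r0', Y = R, Z = I
R = [[1.0, 0.5],
     [0.5, 1.0]]
r0 = [[0.6, 0.4]]
I2 = [[1.0, 0.0],
      [0.0, 1.0]]
weights = triple_product(r0, R, I2)[0]
```

With these made-up correlations the weights come out in the ratio 4 to 1.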

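13. Bartlett's estimate of factors.—We have $z = M_0f_0 +
M_1f_1$, where $f_0$ and $f_1$ are column vectors of the common
and specific factors respectively and $M_1$ is a diagonal
matrix. Bartlett now makes the estimates $\tilde f_0$ such as will
minimize the sum of the squares of each person's specifics
over the battery of tests, i.e.:

$$\frac{\partial}{\partial f_0}(f_1'f_1) = 0$$

i.e.:

$$(-M_1^{-1}M_0)'(M_1^{-1}z - M_1^{-1}M_0\tilde f_0) = 0$$

$$M_0'M_1^{-2}z = M_0'M_1^{-2}M_0\tilde f_0 = J\tilde f_0, \text{ say}$$

$$\tilde f_0 = J^{-1}M_0'M_1^{-2}z \tag{41}$$

(Bartlett, 1937a, 100.)

One could also find the estimated specifics as:

$$\tilde f_1 = (I - M_1^{-1}M_0J^{-1}M_0'M_1^{-1})M_1^{-1}z \tag{42}$$

Substituting:

$$z = M_0f_0 + M_1f_1$$

we get for the relation between $\tilde f$ and $f$:

$$\begin{pmatrix} \tilde f_0 \\ \tilde f_1 \end{pmatrix} = \begin{pmatrix} I & J^{-1}M_0'M_1^{-1} \\ 0 & I - M_1^{-1}M_0J^{-1}M_0'M_1^{-1} \end{pmatrix}\begin{pmatrix} f_0 \\ f_1 \end{pmatrix} \tag{43}$$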


Py aE 710 RM | 


(44)

The error variances and covariances of the common
factors are:

$$E\{(\tilde f_0 - f_0)(\tilde f_0 - f_0)'\} = J^{-1}M_0'M_1^{-1}\,M_1^{-1}M_0J^{-1} = J^{-1}M_0'M_1^{-2}M_0J^{-1} = J^{-1} \tag{45}$$

(Bartlett, 1937a, 100.)

When there is only one common factor, $J$ becomes the
familiar quantity:

$$S = \sum_i \frac{r_{ig}^2}{1 - r_{ig}^2}$$

(Bartlett, 1935, 200.)

As was first noted by Ledermann*:

$$I + J^{-1} = (M_0'R^{-1}M_0)^{-1} = K^{-1} \tag{46}$$

(quoted by Thomson, 1938a); and using this we see that
the back estimates of the original scores from the regression
estimates $\hat f_0$ are identical with the insertion of Bartlett's
estimates $\tilde f_0$ in the common-factor part of the specification
equations, viz.:

$$M_0K^{-1}(M_0'R^{-1}z) = M_0J^{-1}M_0'M_1^{-2}z \tag{47}$$

(Thomson, 1938a.)

Bartlett has pointed out that, using the same identity, in
the form $K = J(I - K)$, it is easy to establish the reversible
relation between his estimates ($\tilde f_0$) and regression
estimates ($\hat f_0$):

$$\hat f_0 = K\tilde f_0, \qquad \tilde f_0 = K^{-1}\hat f_0 \tag{48}$$

(Bartlett, 1938)

and he summarizes their different interpretation and
properties by the formulae:

$$E\{\hat f_0 - f_0\} = 0, \qquad E\{(\hat f_0 - f_0)(\hat f_0 - f_0)'\} = I - K \tag{49}$$

$$E_1\{\tilde f_0\} = f_0, \qquad E_1\{(\tilde f_0 - f_0)(\tilde f_0 - f_0)'\} = J^{-1} \tag{50}$$

where $E$ denotes averaging over all persons, $E_1$ over all
possible sets of tests (comparable with the given set in
regard to the amount of information on the group
factors $f_0$).
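
With one common factor every matrix in this section is a single number, and the relation between the two kinds of estimate can be followed by simple arithmetic. The loadings and scores in this sketch are invented, and the variable names are hypothetical.

```python
l = [0.9, 0.8, 0.7]                  # illustrative loadings on the one common factor
z = [1.2, 0.4, -0.3]                 # one person's standardized scores (made up)

J = sum(x * x / (1.0 - x * x) for x in l)        # M0' M1^-2 M0, a scalar here
K = J / (1.0 + J)                                # follows from K = J(1 - K)

weighted_sum = sum(x * s / (1.0 - x * x) for x, s in zip(l, z))   # M0' M1^-2 z
bartlett = weighted_sum / J                      # Bartlett's estimate, Equation (41)
regression = weighted_sum / (1.0 + J)            # regression estimate via (36.1)

bartlett_error_variance = 1.0 / J                # Equation (45)
regression_error_variance = 1.0 - K              # error variance of the regression estimate
```

The regression estimate is the Bartlett estimate shrunk by the factor $K$; it has the smaller error variance over persons, while the Bartlett estimate is the one that is unbiased for a given person.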
* Letter of October 23, 1937, to Thomson.

14. Indeterminacy.—The fact that estimated factors, if
the factors outnumber the tests, necessarily have less than
unit variance has sometimes been expressed in the case of
one common factor by postulating an indeterminate
vector $i$ whose variance completes unity. This $i$ may be
regarded as the usual error of estimation, and is a function
of the specific abilities (Thomson, 1934, B.J.P., 25, 92). That
$M'R^{-1}M$ in Equation (36) is of rank less than its order also
expresses the indeterminacy, and allows the factors to be
rotated to different positions which nevertheless fulfil all
the required conditions. In the hierarchical case the
transformation which effects this is (Thomson, 1935a):

$$f = B\phi \tag{51}$$

where $B$ means the required number of rows of:

$$B = I - 2qq'/q'q \tag{52}$$

in which:

$$q_i = l_i/m_i \quad \text{(see Equation 7)} \tag{53}$$

as far as there exist tests, after which $q$ is arbitrary.
For:

$$z = Mf = MB\phi = M\phi$$

since:

$$MB = M \tag{54}$$

and $z$ is thus expressed by identical specification equations
in terms of new factors $\phi$. For such transformations in the
case of multiple factors see Thomson, 1936a, 40; and
Ledermann, 1938c.

If the matrix $M$ is divided into the part $M_0$ due to
common factors and the part $M_1$ due to specifics, as in
Equation (9), then Ledermann shows that if $U$ is any
orthogonal matrix of order equal to the number of common
factors, the matrix

$$B = I - Q(Q'Q)^{-1/2}(I - U)(Q'Q)^{-1/2}Q'$$

wherein:

$$Q = \begin{pmatrix} -I \\ M_1^{-1}M_0 \end{pmatrix}$$

will satisfy the equation:

$$MB = M$$

Indeterminacy is entirely due to the excess of factors
over tests, i.e. to the fact that the matrix of loadings $M$
is not square. It can be in theory abolished by adding
a new test which contains no new factor, not even a new
specific; or a set of new tests which add fewer factors
than their number, so that $M$ becomes square (Thomson,
1934b; 1935a, 253). In the case of a hierarchy each of
these tests singly will conform to the hierarchy, so that
their saturations $l$ can be found; but jointly they break
the hierarchy. If they add no new factors, $g$ can then be
found without any indeterminacy.
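
The hierarchical case can be illustrated with three tests. In the sketch below the loading matrix is written with $g$ as its first column and the specifics following, and the element of $q$ belonging to $g$ is set to $-1$; both the ordering and that value are assumptions made here so that $Mq = 0$, not details taken from Equation (7). The saturations are invented.

```python
from math import sqrt

l = [0.9, 0.8, 0.6]                          # illustrative g saturations
m = [sqrt(1.0 - x * x) for x in l]
n = len(l)
k = n + 1                                    # number of factors: g and n specifics

# M = [l | diag(m)]: first column g, then one specific per test
M = [[l[i]] + [m[i] if j == i else 0.0 for j in range(n)] for i in range(n)]

# q_i = l_i / m_i for the specifics; the g element -1 is an assumption (makes M q = 0)
q = [-1.0] + [l[i] / m[i] for i in range(n)]
qq = sum(x * x for x in q)

# B = I - 2 q q' / q'q, a reflection
B = [[(1.0 if i == j else 0.0) - 2.0 * q[i] * q[j] / qq for j in range(k)]
     for i in range(k)]

MB = [[sum(M[i][t] * B[t][j] for t in range(k)) for j in range(k)] for i in range(n)]
BBt = [[sum(B[i][t] * B[j][t] for t in range(k)) for j in range(k)] for i in range(k)]
```

`MB` reproduces `M` and `BBt` is the unit matrix, yet `B` is not the unit matrix: the same specification equations are satisfied by a genuinely different set of orthogonal factors.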

15. Finding g saturations from an imperfectly hierarchical 
battery.—The Spearman formula given in Chapter III, 
Section 5, is the most usual method. A discussion of other 
methods will be found in Burt (1936, 283-7). See also 
Thomson (1934a, 370), for an iterative process modified 
from Hotelling. 

16. Sampling errors of tetrad-differences.—The formulae
(16) and (16A) given in the text are both approximations,
but appear to be very good approximations. The primary
papers are Spearman and Holzinger (1924 and 1925).
Critical examinations of the formulae have been made by
Pearson and Moul (1927), and Pearson, Jeffery, and Elderton
(1929). Wishart (1928) has considered a quantity $P$
which is equal to $P'N^2/(N - 1)(N - 2)$, where $P'$ is the
tetrad-difference of the covariances $a$ instead of the correlations,
and obtained an exact expression for the standard
deviation $\sigma$ of $P$:


N+1 
N — 2)g: = 
( iad ie 


where the D’s are determinants of the following matrix 
and its quadrants : 


DyDy, D T 8DiDa : (58) 


l 
| Gau 4% ! Gs a 
aa Gn y Qag 
aa a ae. Te 
| Gg, Qa | Qas 38 
| 2a Qa | Qas Maa | 
4 


But approximate assumptions are necessary when the
standard deviation of the ordinary tetrad-difference of the
correlations is deduced from that of $P$. The result for
the variance of the tetrad-difference is:

$$\sigma^2 = \frac{N + 1}{(N - 1)(N - 2)}(1 - r_{12}^2)(1 - r_{34}^2) - \frac{1}{N - 2}R \tag{56}$$

where $R$ is the 4 × 4 determinant of the correlations.

17. Selection from a multivariate normal population.—
The primary papers are those of Karl Pearson (1902 and
1912). The matrix form given in the text (Chapter XIX,
Section 2) is due to Aitken (1934), who employed Soper's
device of the moment-generating function, and made a
free use of the notation and methods of matrices. A
variant of it which is sometimes useful has been given by
Ledermann (Thomson and Ledermann, 1938) as follows.
If the original matrix is subdivided in any symmetrical
manner:

$$\begin{pmatrix} R_{pp} & R_{pq} & R_{pr} & \cdots \\ R_{qp} & R_{qq} & R_{qr} & \cdots \\ R_{rp} & R_{rq} & R_{rr} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

and $R_{pp}$ is changed by selection to $V_{pp}$ then each resulting
sub-matrix, including $V_{pp}$ itself, is given by the formula:

$$V_{\alpha\beta} = R_{\alpha\beta} - R_{\alpha p}E_{pp}R_{p\beta} \tag{57}$$

where:

$$E_{pp} = R_{pp}^{-1} - R_{pp}^{-1}V_{pp}R_{pp}^{-1}$$

17a. Maximum likelihood estimation.—The maximum
likelihood equations for estimating factor loadings (Lawley,
1940, 1941, 1943b) may be expressed fairly simply in the
notation of previous sections. It is necessary, however,
to distinguish between the matrix of observed correlations,
which we shall denote by $R_0$, and the matrix:

$$R = M_0M_0' + M_1^2$$

which represents that part of $R_0$ which is “explained” by
the factors.

The equations may then be written:

$$M_0' = M_0'R^{-1}R_0 \tag{58}$$

These are not very suitable for computational work.
It may, however, be shown that:

$$M_0'R^{-1} = (I - K)M_0'M_1^{-2} = (I + J)^{-1}M_0'M_1^{-2} \tag{59}$$

where, as before,

$$K = M_0'R^{-1}M_0, \qquad J = M_0'M_1^{-2}M_0$$

Hence the above equations may be transformed into the
form:

$$M_0' = (I + J)^{-1}M_0'M_1^{-2}R_0 \tag{60}$$

or alternatively,

$$M_0' = J^{-1}M_0'M_1^{-2}(R_0 - M_1^2) \tag{61}$$

When there are two or more general factors the above
equations have an infinite number of solutions corresponding
to all the possible rotations of the factor axes.
A unique solution may, however, be found such as to
make $J$ a diagonal matrix.

Finally, if we put:

$$L = M_0'M_1^{-2}R_0 - M_0'$$

$$V = LM_1^{-2}M_0$$

then, from the last set of equations

$$V = JM_0'M_1^{-2}M_0 = J^2$$

Hence we have:

$$M_0' = V^{-1/2}L \tag{62}$$

These equations have been found the most convenient in
practice, since they can be solved by an iterative process.
When first approximations to $M_0$ and $M_1$ have been obtained
they can be used to provide second approximations
by substitution in the right-hand side.

18. Reciprocity of loadings and factors in persons and
traits (Burt, 1937b).—Let $W$ be a matrix of scores centred
both by rows and columns. Its dimensions are traits ×
persons ($t \times p$), and its rank is $r$ where $r$ is smaller than
both $t$ and $p$ in consequence of the double centring. The
two matrices of covariances are $WW'$ for traits and $W'W$
for persons, and by a theorem first enunciated by Sylvester
in 1883 (independently discovered by Burt), their non-zero
latent roots are the same. If their dimensions differ,
i.e. $t \neq p$, the larger one will have additional zero roots.
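
The equality of the non-zero latent roots of $WW'$ and $W'W$ can be checked without solving for the roots themselves. The sketch below is an indirect check chosen for brevity and is not Burt's procedure: it compares the sums of the first, second, third, and fourth powers of the roots (the traces of the successive powers of each matrix), which agree if and only if the non-zero roots agree. The score matrix is invented, and the helper names are hypothetical.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def double_centre(scores):
    """Remove row means and column means, adding back the grand mean."""
    t, p = len(scores), len(scores[0])
    row_means = [sum(r) / p for r in scores]
    col_means = [sum(scores[i][j] for i in range(t)) / t for j in range(p)]
    grand = sum(row_means) / t
    return [[scores[i][j] - row_means[i] - col_means[j] + grand for j in range(p)]
            for i in range(t)]

def power_sums(a, kmax):
    """Traces of a, a^2, ..., a^kmax, i.e. sums of powers of the latent roots."""
    sums, power = [], a
    for _ in range(kmax):
        sums.append(sum(power[i][i] for i in range(len(a))))
        power = matmul(power, a)
    return sums

scores = [[3.0, 5.0, 2.0, 6.0],      # 3 traits by 4 persons, made-up marks
          [4.0, 4.0, 1.0, 7.0],
          [2.0, 6.0, 3.0, 3.0]]

W = double_centre(scores)
traits = matmul(W, transpose(W))     # W W', order 3
persons = matmul(transpose(W), W)    # W' W, order 4
trait_sums = power_sums(traits, 4)
person_sums = power_sums(persons, 4)
```

The two lists of power sums coincide although the matrices are of different order, the larger matrix merely carrying extra zero roots.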
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where the reference-vector pattern has been entered 
by analogy but could easily be independently found.
It will be seen that the structure and pattern of the
primary factors are identical with the pattern and structure
of the reference vectors except for the diagonal
matrix $D$. The structure of the one is the pattern of the
other multiplied by $D$.

This theorem is not confined to the case of simple
structure, but is more general, and applies to any two sets
of oblique axes with the same origin $O$, of which the axes
of the one set are intersections of “primes” taken $r - 1$
at a time in the space of $r$ dimensions, and the axes of the
other set are lines perpendicular to those primes. By
prime is meant a space of one dimension less than the whole,
i.e. Thurstone's hyperplane. The projections of any point
$P$ on to the one set of axes are identical with the projections
thereon of its oblique co-ordinates on the other set, which
sentence is equivalent to the matrix identities (see 70):

$$F\Lambda = F\Lambda D^{-1} \times D$$

and

$$F(\Lambda')^{-1}D = F(\Lambda')^{-1} \times D$$

that is, Structure on one set = Pattern on other set × Cosines to project it
on to the first set.
obvious in the two-dimensional case 
hesituation. A perspective diagram 
of the three-dimensio very difficult to make 
‘ 2 minating. The vector (or test) OP 
is the “resultant” of its oblique co-ordinates (the pattern),
but not of its projections (the structure). It is of interest
to notice that, either on the reference vectors or on the
primary factors:

Pattern × Transpose of Structure = Test-correlations.

This serves as a useful check on calculations. It is
geometrically immediately


defined by n oblique axes, 
points P and Q each at unit 
tions OP and OQ may b 
to two tests, and cos POQ 

Consider the pattern, on these axes, of $OP$, and the
structure, on the same axes, of $OQ$. The former is composed
of the oblique co-ordinates of the point $P$, the latter
of the projections on the axes of the point $Q$, which projections
($OQ$ being unity) are cosines. Then the inner
product of those oblique co-ordinates of $P$ with these cosines
obviously adds up to the projection of $OP$ on $OQ$, that is
to $\cos POQ$, or the correlation coefficient.

In estimating oblique factors by regression, since the
correlations between factors and tests must be used, the
relevant equation is

$$\hat f_0 = \{F(\Lambda')^{-1}D\}'R^{-1}z \tag{70.1}$$

Ledermann's short cut (section 10a above) requires considerable
modification for oblique factors. We no longer have

$$M_0M_0' + M_1^2 = R$$

but

Pattern × transpose of structure + $M_1^2$ = $R$

i.e. in Thurstone's notation

$$(F\Lambda D^{-1})\{F(\Lambda')^{-1}D\}' + M_1^2 = R \tag{70.2}$$

and using this (Thomson, 1949), we reach the equation

$$\hat f_0 = (I + J)^{-1}\{F(\Lambda')^{-1}D\}'M_1^{-2}z \tag{70.3}$$

where now

$$J = \{F(\Lambda')^{-1}D\}'M_1^{-2}(F\Lambda D^{-1}) \tag{70.4}$$

in place of Ledermann's $J = M_0'M_1^{-2}M_0$.
Only reciprocals of matrices of order equal to the
number of common factors are now required, but the
calculation, like all concerning oblique factors, is still one
of considerable labour.
19a. Second-order factors.—The above primary factors
can themselves in their turn be factorized into one, two, or
more second-order factors, and a factor-specific for each
primary. If the rank of the matrix of intercorrelations
of the primaries can be reduced by diagonal entries to say
two, then the $r$ primaries will be replaced by $r + 2$ second-order
factors which will no longer be in the original
common-factor space. The correlations of the primaries
with these second-order factors will form an oblong matrix
with its first two columns filled, but each succeeding
column will have only one entry corresponding to a factor-specific,
thus:
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EIE 
SISSE 
= 


| = E (say), 
| 
where subscripts must be supplied to indicate the primary
(the row) and the second-order factor (the column).

The primary factors can be thought of as added to the
actual tests, their direction cosines being added as rows
below $F$, which thus becomes:

$$\begin{pmatrix} F \\ D\Lambda^{-1} \end{pmatrix}$$

Imagine this matrix post-multiplied by a rotating matrix
$\Psi$, with $r$ rows and $r + 2$ columns, which will give the
correlations with the $r + 2$ second-order factors. The
lower part of the resulting matrix will be $E$, which we
already know. That is:

$$D\Lambda^{-1}\Psi = E \tag{1}$$

$$\Psi = \Lambda D^{-1}E \tag{2}$$

and the correlations of the original tests with the second-order
factors are then:

$$G = F\Psi = F\Lambda D^{-1}E = VD^{-1}E \tag{3}$$

$G$ is both a structure and a pattern, with continuous


of columns equal to the 
i s d part forming an orthog- 
onal simple structure, 


20. Boundary conditions.—These refer to the conditions
under which a matrix of correlation coefficients can be
explained by orthogonal factors which run each through
only a given number of tests. The problem was first
raised by Thomson (1919b) and a beginning made with
its solution (J. R. Thompson, Appendix to Thomson's
paper). Various papers by J. R. Thompson culminated
in that of 1929, and see also Black (1929). Thomson
returned to the problem in connexion with rotations in the
common-factor space (Thomson, 1936b), and Ledermann
gave rigorous proofs of the theorems enunciated by
Thomson and Thompson and extended them (Ledermann,
1936). A necessary condition is that if the largest latent
root of the matrix of correlations exceeds the integer $s$,
then factors which run through $s$ tests only and have zero
loadings in the other tests are certainly inadequate. This
rule has not been proved to be sufficient, and when applied
to the common-factor space only it is certainly not sufficient,
though it seems to be a good guide. Ledermann
(1936, 170-4) has given a stringent condition as follows.
If we define the nullity of a square matrix as order minus
rank, then if it is to be possible to factorize orthogonally a
matrix $R$ of rank $r$ in such a way that the matrix of loadings
contains at least $r$ zeros in each of its columns, the
sum of the nullities of all the $r$-rowed principal minors of
$R$ must at least be equal to $r$.

21. The sampling of bonds.—The root idea is that of the
complete family of variates that can be made by all possible
additive combinations of bonds from a given pool, and
the complete family of correlation coefficients between
pairs of these. Thomson (1927b) mooted the idea and
worked out the example quoted in Chapter XX. He
had earlier (1927a) shown that with all-or-none bonds the
most probable value of a correlation coefficient is $\sqrt{(p_1p_2)}$,
where the $p$'s are fractions of the whole pool forming the
variates, and the most probable value of a tetrad-difference
is zero. Mackie (1928a) showed that the mean tetrad-difference
is zero, and found its variance to be:
um wo Lo + Papa + PaPa + Pss — P(P:PaPs 
+ PiPPa + pipapa + fosa) + 4p P2PsPa 


E wo (L—») > P) — pat — Pa) 
where $N$ is the number of bonds in the whole pool. He
found for the mean value of $r_{12}$ the value $\sqrt{(p_1p_2)}$, and for
its variance:

$$\sigma^2_{r_{12}} = \frac{(1 - p_1)(1 - p_2)}{N - 1}$$
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This is not the variance of all possible correlation 
coefficients, but of those formed by taking fractions $p_1$ and
$p_2$ of the pool. The whole family of correlation coefficients
will be widely scattered by reason of the different values
of $p$, “rich” tests having high correlations, and those
with low $p$, low correlations. Mackie (1929) next extended
these formulae to variable coefficients (i.e. bonds which no
longer were all-or-none). He again found the mean value
of $F$ to be zero, and for its variance:


,4N —3(N — 2) f2 2\12  2(N — 1) 
gc m i -3j um 


The presence of $\pi$ in this is due to Mackie's limitation to
positive loadings of the bonds. Thomson (1935b, 72)
removed this limitation and found:


Similarly, Mackie found for variable positive loadings 
(1929)— 


hdi 2\2 
s eg pre e 
and for all loadings Thomson found (1935b)— 


1 

(o LE E 

N 
Thomson suggested without proof that in general, when
limits are set to the variability of the loadings of the bonds,
resulting in a family of correlation coefficients averaging $\bar r$,
these correlations will form a distribution with variance:


al 2 
s! = — 79) 
and will give tetrad-differences averaging zero with a 
variance— 


¢* BO — DON — 3) 


N: {ra - nr + Da —

Summing up, Thomson says (1935b, 77-8): “The sampling
principle taken alone gives correlations of all values
. . . and zero tetrad-differences if N be large. Fitting the
sampled elements with weights . . . if the weights may
be any weights . . . destroys correlation when N is infinite.
This means that on the Sampling Theory a certain approximation
to ‘all-or-none-ness’ is a necessary assumption
—not to explain zero tetrad-differences, but to explain
the existence of correlations of . . . large size. . . . The
most important point in all this appears to me to be the
fact that on all these hypotheses the tetrad-differences tend to
vanish. This tendency appears to be a natural one among
correlation coefficients.”

A tendency for tetrad-differences to vanish means, of
course, a still stronger tendency for large minors of the
correlational matrix to vanish. In more general terms,
therefore, Thomson's theorem is that in a complete family
of correlation coefficients the rank of the correlation matrix
tends towards unity, and that a random sample of variates
from this family will (in less strong measure) show the
same tendency.
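
The all-or-none case can be put into figures. The sketch below takes the expected overlap of two random samples of bonds, forms the correlation as the shared bonds divided by the geometric mean of the two sample sizes, and shows that every tetrad-difference of these most probable correlations is zero. The pool size and the fractions are invented, and the variance line assumes that each test draws its bonds without replacement, so that the overlap is hypergeometric; that assumption is made here for the illustration.

```python
from math import sqrt

N = 1000                                   # bonds in the pool (illustrative)
p = [0.8, 0.6, 0.5, 0.3]                   # fraction of the pool sampled by each test

def expected_r(i, j):
    """Most probable correlation of tests i and j with all-or-none bonds."""
    shared = p[i] * p[j] * N               # expected number of bonds in both samples
    return shared / sqrt(p[i] * N * p[j] * N)

r = [[1.0 if i == j else expected_r(i, j) for j in range(4)] for i in range(4)]

# The three tetrad-differences of four tests
tetrads = [r[0][1] * r[2][3] - r[0][2] * r[1][3],
           r[0][1] * r[2][3] - r[0][3] * r[1][2],
           r[0][2] * r[1][3] - r[0][3] * r[1][2]]

# Sampling variance of r_12 about its mean, under the hypergeometric assumption
variance_r12 = (1.0 - p[0]) * (1.0 - p[1]) / (N - 1)
```

Each correlation is the square root of the product of the two fractions, so the matrix of most probable correlations is of rank one apart from its diagonal, which is the tendency described in the closing paragraph of this section.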
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107 ff. ; Spearman, 100; Thur- 
stone, 103. 


B, Holzinger's, 20. 

Bailes, 258, 376. 

Baron, 184. 

Bartlett, M. S., 124, 218, 242, 
255, 318, 317, 327, 352, 356, 
370. 

Bifactor analysis, 19 ff. 

Binet tests, 4. 

Bipolar factors, 139, 331. 

Black, 366, 371. 

Blair, 185. 

Blakey, 122, 371.

Bócher, 345. 

Bonds of the mind, 309 ff., 330, 
367. 

Boundary conditions, 167, 366. 

** Box " correlations, 180. 

Brown, 7, 18, 37, 49, 371. 

Burt, 68, 371 ; analysis of covari- 
ance matrix, 329;  bifactor 
analysis, 23; bipolar factors. 


331; correlation between per- 
sons, 249; g saturations, 359 ; 
marks of examiners, 256; 
* powering" a matrix, 113; 
standard error of loadings, 123, 
124; temperamental types, 
271; test and person factors, 
263 ff., 361. 


Cattell, R. B., 26, 28, 169, 341, 
371. 

Centroid analysis, 63 ff.; geometrical
picture, 104; and
pooling square, 216; sign
changing, 71.

Cluster analysis, 26 ff. 

Coefficient, correlation, see Cor- 
relation ; of belonging, 20; 
reliability, 37. 

Common-factor space, 102, 117, 
337. 

Communalities, 66, 69, 119, 217 ; 
estimation of, 84 ff.; and 
minimum trace, 337 ; unique, 
81. 

Contours of population density, 
ellipsoidal, 93 ; spherical, 97. 

Coombs, 121, 371.

Correlation, of estimates, 237;
maximum, of two batteries,
351; partial, and selection,
282; between persons, 249 ff.;
of regression coefficients, 211;
residual, 44; specific, 45; of
sums, 197; see also Multiple
correlation.
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Correlation coefficients, attenu- 
ation, 37, 116; as cosine, 95, 
103; and covariances, 335; 
definition, 5; as estimation 
coefficients, 195 ; inconsistent, 
51,215; multiple, 197 ; nega- 
tive and positive, 316 ; spatial 
representation, 98. 

Cosine, as correlation coefficient, 
95, 103. 

Covariance, and correlation coefficients,
335; of estimated
factors, 238 ff.; and Lawley's
loadings, 333; matrix, analysis
of, 264, 329, 331; and
oval diagrams, 11; of regression
coefficients, 211.

Criterion, 197; most predict- 
able, 198, 218, 351. 


Davey, 17, 371. 

Davies, 249. 

Davis, 124, 126, 371. 

Degrees of freedom, 38, 337. 

Determinants, evaluation of, 65, 
205; and inconsistent cor- 
relations, 52, 216 ; minors of, 
64. 

Direction cosines, 160, 173, 181. 

Dodd, 371. 

Doolittle, computation of re- 
gression coefficients, 355. 


Elderton, 40, 359, 374. 

Ellipsoids of density, 93. 

Emmett, 41, 42, 127, 282, 371. 

Essays, examiners’ marks, 253, 
256. 

Estimates, correlation of, 237. 

Estimation, Bartlett's method, 
242, 356 ; geometrical picture, 
218, 243; of a man's factors, 
221, 227, 336 ; of oblique fac- 
tors, 245 ; of scores, 205, 


Examiners, factorial analysis of, 
253 ff. 
Extended vectors, 156, 172. 


Factors—axes, 99 ; bipolar, 139,
331; danger of reifying, 58 ; 
dependence on population, 
300 ; effect of selection, 285 ff.; 
as fictitious tests, 4; group, 
14; limits to extent of, 167 ; 
limits to number of, 121 ff. ; 
and loadings, reciprocity, 267, 
361 ; oblique, 168, 170 ff., 341, 
362; persons and tests, 263 ff. ; 
primary, definition, 170; ro- 
tation of, 139; second order, 
186 ff. ; specific, 8; standard 
errors, 134 ; verbal, 15. 

Factors, a man's, estimation of, 
221, 242, 353; estimation of 
oblique, 245; principal com-
ponents, 111 ; short method
of estimation, 227 ; and voca- 
tional advice, 231 ff. 

Ferguson, 371. 

Fisher, 39, 132, 211, 371, 372. 


g—definition, 48; distinction
from other factors, 327; es- 
timation, 221 ff., 359 ; meas- 
uring pure, 50; as mental 
energy, 58 ; saturations, 8, 43. 

Galton, 92. 

Garnett, 63, 74, 312. 

Garrett, 372. 

Geometrical picture, centroid 
analysis, 104; correlation, 
92 ff. ; estimation, 213, 243 ; 
univariate selection, 277. 

Green, 220, 353, 372. 

Group factors, definition, 14 ;
saturations, 18.

Guilford, 122,


Harman, 184, 372. 
Harper, 184, 185. 
Hart, 7, 375.

Heywood, 51, 187, 338, 372. 

Hierarchical order, 5, 8; appli- 
cation of centroid method, 66; 
and physical measurements,
325 ; and selection, 284 ; when 
tests equal persons, 53. 

Histogram, 13; of tetrad-differ- 
ences, 41. 

Hollow staircase pattern, 19. 

Holzinger, 19 ff., 29, 40, 41, 184, 
355, 359, 372, 375. 

Horst, 372. 

Hotelling, 116, 124, 198, 218, 
219, 329, 330, 372; principal 
components method, 109 ff., 
264, 349. 

Householder, 377. 


Independence of units, 331 ff. 
Indeterminacy, 357. 
Inequality of men, 315. 
Inner product, definition, 74. 
Irwin, 372. 

Iterative methods, 119. 


Jeffery, 40, 359, 374. 


Kelley, 17, 46, 112, 217, 283, 373. 
Kent, 185. 


Lacey, 122. 

Landahl, 161, 373. 

Latent root, 113, 124, 265, 350 ; 
definition, 110, 111. 

Lawley, 124, 275, 373; maxi- 
mum likelihood method, 127 ff., 
333, 360. 

Ledermann, 52, 82, 166, 168, 227, 
245, 284, 293, 338, 354, 357, 
358, 360, 367, 373. 

Limits, to extent of factors, 167 ; 
to number of factors, 121 ff. 

Lindquist, 208, 373. 

Loadings, definition, 68, 333 ;
and factors, reciprocity, 267,
361; negative, 139, 261, 331 ;
standard errors, 134 ; see also
Saturations.


Mackie, 323, 347, 367, 368, 373. 

McNemar, 122, 374.

Matrix, calculation of reciprocal,
209, 354; definition, 8, 144;
double centred, 268 ; Landahl, 
162; multiplication, 112, 145 ; 
notation, 346 ; orthogonal ro-
tating, 144; “powering,” 112;
rank of, 48, 63; special or- 
thogonal, 149 ; trace of, 337 ; 
see also Latent root. 

Maximum likelihood method, 
127 ff., 360. 

Medland, 90, 374. 

Metric, 329. 

Minors of a determinant, 64. 

Monarchic doctrine, 312.

Moods, analysis of, 261. 

Mosier, 121, 374. 

Moul, 40, 359, 374. 

Multiple correlation, computa- 
tion, 199, 206, 216 ; definition, 
197; with factors, 223 ; with 
g, 10. 

Multiple-factor analysis, 63 ff. 


Negative loadings, 139, 261, 331. 
Normal distribution, 35 ff. 
Normalized scores, definition, 6. 
Normalizing, 159. 

Notation, 345, 346. 

Nuclear clusters, 28. 


Oblique factors, 170 ff., 341, 362 ; 
estimation of, 245, 365. 

Oligarchic doctrine, 312.

Orthogonal axes, 92, 139 ff.,
151 ff.

Orthogonal simple structure, 
151 ff. 
Orthogonality, test of, 171.

Otis-Kelley, correction for selec-
tion, 283.

Oval diagrams, 11, 69, 77, 308.


Parallel proportional profiles,
169.

Parsimony of hypothesis, 15, 339. 

Pattern and structure, 170, 176, 
333, 362. 

Pearson, K., 40, 275, 296, 359,
360, 374.

Peel, 219, 353, 374. 

Physical measurements and hier- 
archical order, 325. 

Pivotal condensation, 65, 201,
209.

Pooling Square, 197 ff., 350 ; and
centroid analysis, 216.

Price, 317, 374.

Primary factors, correlation with
reference vectors, 188 ; defini-
tion, 170, 176, 363.

Principal components, Hotelling,
107, 349; acceleration by
powering, 112; in common
factor space, 117; compared
with maximum likelihood load-
ings, 135; computation, 108 ff.,
264; Kelley’s method, 112;
significance test, 124.

Probable error, 36.


Raath, 184, 374. 
Rank of a matrix, 48; defini-
tion, 65; low reduced, 318;
reduced, 48; unchanged by
selection, 291.

Reciprocal matrix, calculation, 
209. 

Reciprocity of loadings and fac-
tors, 267 ff., 339, 361.

Reduced rank of a matrix, 48. 

Reference tests, 45. 


Reference vectors, correlation 
with primary factors, 188; 
definition, 170, 176 ; structure 
on, 333, 362. 

Regression, 195 ff.; equation, 
351; estimates of factors, 353, 
365. 

Regression coefficients, compu- 
tation, 205, 211, 226, 227, 355 ; 
correlation of, 211; covari- 
ances of, 211; geometrical 
picture, 212; Sonet: 199 ; 
standard error, 203, 211;
variances of, 211. 

Reliability, coefficient, 37 ;
weighting for battery, 219.

Residues, 44; significance tests,
120 ff., 191.

Reyburn, 122, 146, 183, 184,
341, 374.

Richness of tests, 310, 330.

Rosner, 91.

Rotation of axes, 139 ff., 330 ;
Alexander, 79, 140; by ex-
tended vectors, 157 ; Landahl,
162; Ledermann, 166; or-
thogonal, 347; to orthogonal
simple structure, 154 ff.;
Spearman, 100 ; Thurstone,
103.


Sampling error, 33 ff., 121; of 
tetrad differences, 40, 359. 


Sampling theory of ability,
307 ff., 330, 348, 367 ; Dodd,
371. 

Saturations, 8; see also Load- 
ings. 

Second-order factors, 186 ff., 365. 

Selection, and communalities,
283; and factor loadings,
284 ff.; geometrical picture,
276 ; and matrix rank, 291;
multivariate, 294 ff., 360;
Otis-Kelley formula, 283 ; and
partial correlation, 282; and
simple structure, 303; uni-
variate, 275 ff. ; and variance
of differences, 281.

Sheppard, 95. 

Sign changing in centroid anal- 
ysis, 71, 106. 

Significance, of correlation resi- 
dues, 120 ff., 131; definition, 
36, 121; of principal com-
ponents, 124.

Simple structure, criticisms, 182 ; 
Horst, 372 ; and independence 
of units, 331 ff. ; Ledermann's 
method, 166 ; orthogonal, 151
ff. ; rotation to, 154.

Singly conforming tests, 55. 

Soper, 360. 

Space, common-factor, 102, 117,
337; ellipsoidal, 93; spher-
ical, 97.

Spearman, passim, 374.

Specific factors, 8, 3 ; maxi-
mized, 120, 312.

Standard deviation, definition, 5. 

Standard error, of correlation 
coefficients, 39; of loadings, 
134; of a tetrad-difference, 
40, 359; of variance, 38; of 
z, 40. 

Standardized scores, definition, 
6. 

Stephenson, 17, 18, 49, 249, 251, 
259, 261, 371, 375. 

Structure and pattern, 170, 176, 
333, 362. 

Sub-pools of the mind, 313. 

Swineford, 122, 375. 


Taylor, 122, 146, 183, 341, 374. 

Tetrad-differences, definition, 12 ; 
distribution, 41; and sampl-
ing theory, 311, 367; stan-
dard error, 40, 359.
Thompson, J. R., 366, 375.
Thomson, passim, 375.

Thurstone, L. L., passim, 376.
Thurstone, T. G., 377.
Trace of a matrix, 337.

Tryon, 20, 26, 377. 

Tucker, 121, 377. 

Turnbull, 345.

Two-factor theory, 5 ff., 347. 


Unique communalities, condi- 
tions for, 83. 


Variance, absolute, of tests, 326,
329; analysis of, and multiple
correlation, 208; definition,
5, 11; of differences in sam-
ples, 281; of estimated fac-
tors, 238; of factor, 78 ; and
oval diagrams, 11; of regres-
sion coefficients, 211; of sam-
ples, 37.


Vectors, definition, 102; ex- 
tended, 156, 172; reference, 
170. 


Verbal factor, 15. 
Vernon, 122, 377. 
Vocational advice, 231, 355. 


Weighted battery, Spearman’s 
weights, 10. 

Wherry, 203, 377. 

Wilson, 83, 123, 227, 354, 377. 

Wishart, 40, 359, 377. 

Worcester, 83, 123, 377. 

Wright, 317, 377. 


Yates, 182, 372. 
Young, 377. 


z-transformation of correlations, 
39. 